(Volume 28, Number 1, September 1959)


THE DEVELOPMENT OF SCALES OF 
ATTITUDINAL DIMENSIONS * 


T. BENTLEY EDWARDS and ALAN B. WILSON 


University of California, Berkeley, California


* The research reported herein was performed pursuant to a contract with the United States Office of Education, Department of Health, Education and Welfare.

** All other footnotes will be found at end of article.


IT HAS BEEN suggested1 ** that the attitudes of students toward school subjects might be illuminated by analyzing the habitual orientation of students toward decision or choice situations according to two dimensions. One dimension is provided by considering the degree of deliberation involved in making a choice: the preference for ends suggested by abstract, relational consideration, as contrasted with the appreciation of immediate, proximate ends; the other dimension is provided by the relative preference for social as opposed to non-social objects.

The intersection of these two dimensions yields four vectors: 1) ‘‘theoretic,’’ the deliberative, analytic orientation toward the non-social environment, characterized by interest in the natural sciences and in mathematics; 2) ‘‘prudent,’’ the deliberative orientation toward the social environment, characterized by interest in the behavioral sciences; 3) ‘‘aesthetic,’’ the appreciation of the quality of the directly given non-social environment; and 4) ‘‘immediate,’’ the preference for proximate social ends. 2

The ranges between the poles of these four vectors provide six continua for the assessment of the structure of attitudes: 1) prudent vs. theoretic, 2) prudent vs. immediate, 3) prudent vs. aesthetic, 4) theoretic vs. immediate, 5) theoretic vs. aesthetic, and 6) aesthetic vs. immediate. These abstractions are summarized by the design in Figure 1.

If an individual who selects relatively proximate ends in a variety of situations may be reliably expected to select proximate ends in a future choice situation, characterizing him as ‘‘immediate’’ (or ‘‘aesthetic,’’ in a non-social context) would be of value for the prediction of behavior. More important for educators, however, is the possibility that analysis of decision behavior with the aid of these dimensions may provide a useful theoretical perspective for the evaluation of curricular objectives. An instrument which ‘‘measures’’ an individual’s position on the continua could be used to evaluate the effects of curricular modification.

In this paper we will describe efforts to construct six scales in such a fashion that scale scores obtained by individuals will be valid indexes of their relative positions on the six continua.


Construction of the Scales 


A large number of short propositions expressing preference for one of two alternatives were written. The content of each alternative was designed to represent one pole of one continuum, and was paired, in the proposition, with an alternative relatively approximating the opposite pole, e.g., prudent vs. theoretic. From this fund of propositions, thirty were selected for each of the six scales, making 180 items in all for the six scales. The following criteria governed the editing and selection of items:


1. Theoretical relevance of content 

2. Clarity of meaning 

3. Appropriateness of vocabulary and content 
for high school students 

4. Balance of proportion of ‘‘positive’’ vs.
‘‘negative’’ items (one pole of each continuum
was arbitrarily designated positive or
favorable; hence disagreeing with a negative
item would be a positive response)

5. Avoidance of dyslogistic phraseology or al- 
ternatives counter to cultural universals. 

6. Avoidance of factual items, responses to 
which might not reflect attitudes. 


These criteria are comparable to those suggested 
in current literature3 on attitude scales except that 
each item contains two ideas, usually explicitly 
stated, though sometimes implicit, to be com- 
pared. It would seem, however, that this greater 
grammatical complexity is offset by the simplicity 
of decision in such comparison. 
FIGURE 1


SCHEMATIZATION OF CONTINUA RESULTING FROM CONCURRENCES OF
VARYING COMBINATIONS OF TWO DIMENSIONS

The 180 selected items were randomized with the provision that no two items from the same scale should be adjacent. The mimeographed ‘‘Inventory of Choices’’ was administered to two groups of individuals: 92 high-school students in a required biology course, composed of four sections, roughly stratified by the school according to IQ and prior achievement; and 50 undergraduate university students in two sections of an education course. Responses were indicated on a Likert-type six-point scale: ‘‘strongly agree,’’ ‘‘moderately agree,’’ ‘‘slightly agree,’’ ‘‘slightly disagree,’’ ‘‘moderately disagree,’’ and ‘‘strongly disagree.’’

Response categories were assigned the arbitrary weights, 5-0, with strongly agree to positively phrased items, or strongly disagree to negatively phrased items, being assigned the weight 5, etc. Where a respondent could not make a decision, he was instructed to cross out the number of that item. Such infrequent responses were assigned the intermediate weight, 2. Scores on each scale for each individual were obtained by summing these arbitrary weights.
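
The weighting rule just described can be sketched in a few lines. This is only an illustration: the names `AGREEMENT`, `item_weight`, and `scale_score` are hypothetical, and a crossed-out item is represented here by `None`, which is an assumption about how the data might be coded.

```python
# Response labels ordered from least to most agreement (ranks 0..5).
AGREEMENT = [
    "strongly disagree", "moderately disagree", "slightly disagree",
    "slightly agree", "moderately agree", "strongly agree",
]
UNDECIDED_WEIGHT = 2  # weight given when the item number was crossed out


def item_weight(response, positive_item):
    """Return the 0-5 weight for one response; None means crossed out."""
    if response is None:
        return UNDECIDED_WEIGHT
    rank = AGREEMENT.index(response)
    # Agreement with a positive item and disagreement with a negative
    # item both count toward the "positive" pole of the continuum.
    return rank if positive_item else 5 - rank


def scale_score(responses, item_is_positive):
    """Sum the arbitrary weights over the items of one scale."""
    return sum(item_weight(r, k) for r, k in zip(responses, item_is_positive))
```

For instance, strong agreement with a positive item, strong agreement with a negative item, and one crossed-out item would contribute 5, 0, and 2 respectively.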

Following Louis Guttman’s ‘‘Cornell technique,’’4 twelve tables were laid out, one for each scale for each of the two samples separately. Each table was constructed with one column for each of six response categories of each item, and one row for each individual in the sample. Thus, since each scale had thirty items and each item contained six response categories, each table consisted of 180 columns and as many rows as individuals in the sample. The individuals were ordered, high to low, for each scale, according to their summated scores, and listed vertically down the left side of each table. The responses of each individual to each item were indicated by X’s in the appropriate cells, the resulting tables containing all of the data gathered.

The array of responses to each item was double-dichotomized: vertically by combining response categories (e.g., 5-3 considered positive, 2-0 considered negative); horizontally at the point where the frequency of positive responses was greater above the line, and negative responses greater below the line. The responses in each of the four resulting quadrants were then tabulated in 2 x 2 contingency tables. The association between high scores (dependent on positive responses to a majority of items) and positive responses to each item, was tested by the chi-squared test corrected for continuity. Two values of chi-squared were thus obtained for each item, one value from each sample, providing replication for judging the reliability of the association in differing groups. Figure 2 illustrates these contingency tables.
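
As an illustration of the test named above, the following sketch computes chi-squared with the continuity (Yates) correction for a single 2 x 2 table. The function name and the four cell counts in the usage lines are hypothetical; they are not data from the study.

```python
def yates_chi_squared(a, b, c, d):
    """Chi-squared with continuity correction for the table [[a, b], [c, d]].

    a: high scorers, positive response   b: high scorers, negative response
    c: low scorers, positive response    d: low scorers, negative response
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a marginal total is zero; the test is undefined")
    # The correction subtracts n/2 from |ad - bc|, never going below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts for one item in a sample of 92 respondents.
value = yates_chi_squared(30, 16, 12, 34)
CRITICAL_05_DF1 = 3.841  # tabled chi-squared value, 1 degree of freedom
print(round(value, 2), value > CRITICAL_05_DF1)
```

The sign of the association has to be read from the table itself, since chi-squared does not distinguish a positive from an obverse relationship.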

All of the associations in the contingency tables were ‘‘positive,’’ the values of chi-squared exceeding the .05 level of significance in the majority of cases. Examination of the array of responses to each item in the original tables, and their summarization in the contingency tables, points to the following observations: 1) while there was considerable ‘‘higgledy-pigglediness’’5 in the response patterns, the central tendency of the responses to each item lay on a diagonal from upper left (high scores and high category values) to lower right (low scores and low category values); 2) the confirmation of the ‘‘prediction’’ of an individual’s response to each item from his responses to all other6 items confirms (a) the direction of the a priori weights originally assigned to the response categories of each item7; and (b) by the same token, the existence of a communality of content through all the items.

Twelve items were selected from the thirty of 
each scale on the basis of the following criteria: 


1. The level of discrimination in the high-school 
group indicated by the value of chi-squared, 

2. The reliability of the discrimination indi- 
cated by the replication with the university 
group, 

3. The variation in the distribution of item mar- 
ginal totals, indicating the ‘‘popularity’’ of 
positive responses, in order to provide sev- 
eral ‘‘cutting points’’ in the continua. 


The twelve selected items from each of the six scales were then again randomized to construct a revised ‘‘Inventory of Choices,’’ composed of seventy-two items.


Conditions for the ‘‘Measurement’’ of Attitudes 


In a uni-dimensional scale, ‘‘... persons who answer a given question favorably all have higher ranks on the scale than persons who answer the same question unfavorably.’’9 Alternatively, an individual who has a higher scale score than another, has given positive responses to all of the items which were positively responded to by the latter, plus one or more additional items. This requirement gives rise to Louis Guttman’s stipulation that the responses to each item shall be a simple function of the persons’ ranks;10 that the responses to each item shall be reproducible from the scale score. If the items are arranged in order of popularity, a person with a scale score of 3 will have given positive responses to the three most popular items, etc. Thus, if we are to say that one individual is more ‘‘prudent’’ than another, we should mean that the former has given prudent responses to all the items responded prudently to by the latter, and more. If this condition be obtained, an ‘‘ordinal’’ or ‘‘rank order’’ scale would be constructed. 11
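
The cumulative requirement can be stated compactly in code. The sketch below assumes dichotomous responses coded 1 (positive) and 0 (negative) and items listed from most to least popular; the function names are illustrative only.

```python
def ideal_pattern(score, n_items):
    """Responses implied by a scale score when items run most popular first."""
    return [1] * score + [0] * (n_items - score)


def is_scale_type(responses):
    """True when the responses are exactly those implied by their own score."""
    responses = list(responses)
    return responses == ideal_pattern(sum(responses), len(responses))


# A score of 3 on five items implies positives on the three most popular items.
print(ideal_pattern(3, 5))
print(is_scale_type([1, 1, 1, 0, 0]), is_scale_type([1, 0, 1, 1, 0]))
```

Under this definition two respondents with the same score who are both scale types have necessarily answered every item alike, which is what permits responses to be reproduced from scores.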

With a perfectly cumulative scale, the frequencies in the upper-negative and lower-positive quadrants of the 2 x 2 contingency tables, such as are illustrated in Figure 2, would be zero. No persons with scale scores above a certain point would give negative responses to an item, nor would any persons with scale scores below that point give positive responses to the item. Scores obtained with the use of such perfect scales would include no ‘‘errors.’’ Designating obverse responses as ‘‘errors’’ does not mean that one could confront the respondent with his ‘‘error’’ and tell him that he was mistaken in his attitude: ‘‘You didn’t really mean this.’’ Each response must be assumed to define the respondent’s attitude toward the content of the item. Hence these ‘‘errors’’ are departures of the data from the model of uni-dimensionality. They indicate multi-dimensionality in the content of the items. Confronted with the complexity of an item as a whole, the respondent’s decision is based on the factors specific to each item, not necessarily on the factor common to all the items of a scale.

If two or three major dimensions exist in a scale then the interpretation of a given scale score is ambiguous since one respondent may have attained his score by positive responses to items of one dimension, while another respondent attained the same score by positive responses to items representing another dimension. For example, if income and education are combined as an index of socio-economic status, the meaning of a given rank is indeterminate since it may have been attained by either income or education, and these need not be interchangeable for the valid prediction of an external variable. 12

On the other hand, if each item has specific factors, drawing idiosyncratic responses, but all of the items of a scale share a single common factor, then the rank order established by the group of items should rank individuals approximately in accord with the ‘‘ideal’’ ranking based on the common factor alone, specific factors more or less cancelling one another.

Since, in the formulation of the items for these scales, a deliberate effort was made to have each item represent a continuum in a different context, each item is obviously multi-dimensional. ‘‘Errors’’ are necessarily to be expected. But by examining the pattern of responses we should be able to make a preliminary judgment as to whether we have a single dominant factor and numerous specific factors, in which case the ordering of individuals would rest essentially on the dominant factor; or whether two or three major dimensions exist in the scale.

If a large group of individuals who have high scale scores, having given positive responses to a majority of items, give negative responses to one or two items, while a group of individuals with lower scale scores give positive responses to the same one or two items, then those items, having blocked groups of errors, represent a different dimension from the rest of the scale. On the other hand, if the frequency of negative responses to each item increases continuously as one descends the order of scale scores, then each item participates in a communality of content, and contributes to the discrimination of ranks in the same dimension.13

Were items representing a different dimension obversely related to the rest of the items in a linear fashion throughout the range of scale scores, so that the higher the scale score the greater the frequency of negative responses, then the obverse association in the condensed contingency tables, or the diminished value of chi-squared, would reveal this without a detailed examination of the pattern of responses through the range of scale scores. However, if such an item were positively related to the rest of the scale through the greater part of the range of scale scores, but obversely related in some limited range (as if, for example, those most strongly ‘‘prudent,’’ along with those who are ‘‘immediate,’’ as indicated by total scores, should give ‘‘immediate’’ responses to one item, while those who are moderately ‘‘prudent’’ should give the ‘‘prudent’’ responses), the condensation of these responses into a 2 x 2 table would obscure this curvilinear relationship, while a significant value of chi-squared might still be obtained.

We shall be concerned, in the preliminary analysis of the scales, therefore, not only with the association of high scale scores with positive responses to each item, to determine whether each item of the scale contributes to the discrimination of ranks in the same dimension; but also with the pattern of responses to determine whether this essential uni-dimensionality obtains throughout the range of scale scores.

Afterwards, if the scales prove to be essentially uni-dimensional, even though errors may be numerous, then the cancellation of specific factors may be enhanced, and an approximately cumulative scale constructed, by grouping equivalently popular items and defining positive responses to a majority of each group of items as a positive response to the compound item. By the use of this ‘‘H-technique’’ of Samuel Stouffer, 14 the dominant factor is brought out more clearly by making allowance for some idiosyncratic variation. This reduces the number of errors, thus enhancing the possibility of reproducing scores from ranks and vice versa, but it also reduces the number of possible ranks to which an individual may be assigned. In a sense, the technique helps the investigator use an instrument within its powers of resolution.


Preliminary Analysis of the Revised Scales


The revised inventory of seventy-two items, consisting of the twelve selected items for each of the six scales, was administered to 325 high-school students in college-preparatory chemistry and physics classes.

The responses were again tabulated into six tables, using the ‘‘Cornell technique’’ as described above. The matrix of responses to each item was double-dichotomized and tabulated in 2 x 2 contingency tables. The association of high scale scores with positive responses was significant beyond the .001 level for each item. The values of chi-squared are reported in the last row of Tables I-VI for the items of each scale.

Each individual was re-scored, assigning the weights 1 and 0 to the new categories formed by the vertical dichotomization. This resulted in a possible range of scale scores from 12 (positive responses to all items) to 0 (negative responses to all items). The number of positive responses to each item given by the individuals with each scale score was then tabulated. These are converted into proportions and reported in the body of Tables I-VI. (The ultimate positive frequencies can be recovered from the product of the proportion and the frequency at each scale score: p · f_o; and the negative frequencies from (1 - p) · f_o.)

It can be seen from inspection that the proportions of positive responses to each item form an approximately continuous gradient, such that the lower the scale score, the smaller the proportion of positive responses, subject to sampling fluctuations. 15 Therefore we may judge that each item makes a significant contribution to the discrimination of ranks along the same dimension.

However, the frequency of errors, attributable 
to specific or random factors, and the presence 
of reversals in the gradient of proportions of pos- 
itive responses in adjacent ranks (see footnote 15), 
indicates that the amalgamation of ranks by the 
combination of ‘‘equivalent’’ items would not be 
wasteful of any relevant discriminations. 


Re-Scoring by the ‘‘H-Technique’’


In order to minimize the reflection in the scale 
scores of idiosyncratic responses to specific fac- 
tors, ‘‘equivalent’’ items were combined to form 
the elements of compound items. The combina- 
tion or grouping was done so that 1) the elements 
of each compound item should be as similar as 
possible to each other, and 2) the compound items 
should be as different as possible from one an- 
other. 

The cumulative proportion of responses in each 
category of each item was computed and the twelve 
items of each scale were then ranked in order of 
increasing popularity of positive responses. Three 
compound items were formed from the items of 
each scale, the four least popular items forming 
the elements of compound item A, ...the four 
most popular items forming the elements of com- 
pound item C. 

The responses to the elements of each compound item were then dichotomized so that approximately equal proportions of the responses to the elements of any one compound item should be positive. At the same time, about 25 percent of the respondents should give positive responses to three or four of the elements of compound item A, about 50 percent should give positive responses to three or four of the elements of compound item B, and about 75 percent should give positive responses to three or four of the elements of compound item C. In this way, the elements of each compound item are as equivalent as possible, and the resulting compound items are as different as possible from each other, ‘‘equivalence’’ and ‘‘difference’’ being defined by the population of respondents rather than by a priori judgment. This procedure is illustrated by the data from the prudent-theoretic scale in Table VII.

A positive response to each of the compound items is defined by a modal positive response to the elements of that item. In other words, positive responses to all four, or to three out of the four, elements of a compound item are scored as a positive response to the compound item; negative responses to two, three, or four of the four elements are scored as a negative response to the compound item.
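
The modal scoring rule can be sketched as follows, assuming each element response has already been dichotomized to 1 (positive) or 0 (negative). The names `compound_response` and `rescore` are hypothetical.

```python
def compound_response(elements):
    """Score one compound item from its four dichotomized elements.

    Three or four positive elements give a positive compound response (1);
    two, three, or four negative elements give a negative one (0).
    """
    if len(elements) != 4:
        raise ValueError("each compound item has exactly four elements")
    return 1 if sum(elements) >= 3 else 0


def rescore(item_a, item_b, item_c):
    """Return the (A, B, C) response pattern and the resulting scale score."""
    pattern = tuple(compound_response(e) for e in (item_a, item_b, item_c))
    return pattern, sum(pattern)


# One respondent: one positive element in A, three in B, four in C.
print(rescore([1, 0, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]))
```

Because each scale then has only three compound items, the possible scale scores run from 0 to 3.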

The 325 individuals in the high-school group were then re-scored according to their modal responses to the compound items on each scale. The frequency of each possible pattern of response to the compound items of the six scales is shown in Table VIII.


Analysis of the Modified Scales 


It can be seen that the patterns of responses given by the majority of individuals were cumulative. If they gave a positive response to the least popular item, A, they also gave positive responses to the other two items; if they gave a positive response to item B, they also gave a positive response to C; and, vice versa, if they gave a negative response to C, they also gave a negative response to A and B, if they gave a negative response to B, they also gave a negative response to A. For these ‘‘scale-types’’ it is possible to say that an individual with a score of, for example, 2, made a positive response to the item responded to positively by an individual with a score of 1, and one more. Moreover, it is possible to determine, from the scale scores, which items an individual made positive responses to.

Inspection of the frequencies of non-cumulative response patterns shows that the greater the discrepancy of the pattern from the uni-dimensional pattern, the less frequently the pattern is observed. For example, of the 93 individuals who gave positive responses to only one item of the prudent-theoretic scale, 79 gave the positive response to item C, 12 to item B, and 2 to item A.

TABLE VIII


FREQUENCY OF RESPONSES OF EACH PATTERN TO THE ITEMS OF EACH SCALE


         Response
         Pattern             Frequency in Each Scale
Score    A B C        P-T    P-I    P-A    T-I    T-A    A-I

  3      + + +         61     42     40     48     42
  2      - + +         88    103     62     87    104     78
         + - + *       11     15     19     15     20     15
         + + - *        0      4      3      1      3
  1      - - +         79    107    119     73     83    107
         - + - *       12     17     15     12     18     12
         + - - *        2      2      4      5      6
  0      - - -         72     35     63     84     49


*Non-cumulative response patterns, classed as errors.


The frequencies of positive and negative responses to each item by the individuals in each rank are recorded in the condensed scalograms of Tables IX-XIV. For item A of each scale positive responses by individuals having scores of 2-0 are ‘‘errors’’; for item B, negative responses by individuals having a score of 2 and positive responses by individuals having a score of 1, are ‘‘errors,’’ etc.

A coefficient of reproducibility is the proportion of responses which fit the cumulative patterns. A coefficient of reproducibility of .92 means that from an individual’s scale score we can reproduce his responses to each item with 92 percent accuracy (by predicting the cumulative pattern of responses). The coefficients of reproducibility for each scale are reported in the ‘‘total’’ column of the last rows of Tables IX-XIV. The figures under each item apply to the items separately. These coefficients are seen to range from .90-.95, which values are within the bounds established by Guttman for adequate reproducibility. 17
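
As a rough check on the definition, the sketch below counts errors against the cumulative pattern predicted from each scale score and converts them to a coefficient of reproducibility. The pattern frequencies are the prudent-theoretic values as far as they can be read from the tables in this copy; the zero for the pattern (+ + -) is an inference from the row totals rather than a legible entry, and the function names are hypothetical.

```python
# (A, B, C) response pattern -> frequency, prudent-theoretic scale, N = 325.
# The entry for (1, 1, 0) is inferred from row totals, not read directly.
PT_PATTERNS = {
    (1, 1, 1): 61,
    (0, 1, 1): 88, (1, 0, 1): 11, (1, 1, 0): 0,
    (0, 0, 1): 79, (0, 1, 0): 12, (1, 0, 0): 2,
    (0, 0, 0): 72,
}


def predicted(score):
    """Cumulative pattern for a score: A is least popular, C most popular."""
    return tuple(1 if score >= needed else 0 for needed in (3, 2, 1))


def reproducibility(patterns):
    """Return (errors, responses, coefficient) for a pattern-frequency table."""
    errors = responses = 0
    for pattern, freq in patterns.items():
        ideal = predicted(sum(pattern))
        errors += freq * sum(p != q for p, q in zip(pattern, ideal))
        responses += freq * len(pattern)
    return errors, responses, 1 - errors / responses


errors, responses, coefficient = reproducibility(PT_PATTERNS)
print(errors, responses, round(coefficient, 3))
```

Each non-cumulative pattern counts as two errors under this convention, one for the item answered positively out of turn and one for the item that the score predicts but the respondent rejected.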

Thus, within the limits of error described by 
the coefficient of reproducibility, the scale scores 
validly order the individuals along the continua 
defined by the content of the items. It is possible, 
with 90-95 percent accuracy, to say, for example, 
that a more ‘‘prudent’’ individual has made pru- 
dent responses to all items responded to prudent- 
ly by a less prudent individual, and more. 

The coefficient of reproducibility describes the confidence with which we may assume that a given scale score represents a particular cumulative pattern of responses. But it does not describe the confidence with which we may reject the possibility that the cumulative pattern of responses might have been obtained by chance were the items actually independent rather than cumulatively interdependent.

For example, with three independent items, if 
3/4 answer the most popular item, C, positively, 
1/2 answer the next most popular item, B, posi- 
tively, and 1/4 answer the least popular item, A, 
positively; then 1/2 of 3/4 = 3/8 would answer both 
B and C positively by chance, if the items had no 
relationship, and 1/4 of 3/8 = 3/32 would answer 
all three positively by chance, etc. Thus, the 
expectation for each response pattern for inde- 
pendent items can be computed from the product 
of the proportions giving each response. 

If the items are strictly cumulative, however, then all those who give positive responses to A will also give positive responses to B and C. Thus 1/4 or 8/32, rather than 3/32, will give positive responses to all three items; none will give non-cumulative response patterns, etc.
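
The arithmetic of this example can be worked through for all eight response patterns. The sketch below uses exact fractions with the hypothetical proportions 1/4, 1/2, and 3/4 from the text; the function names and the way the sign is reported are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Proportion answering each item positively in the hypothetical example.
P = {"A": Fraction(1, 4), "B": Fraction(1, 2), "C": Fraction(3, 4)}


def independent(pattern):
    """Expected proportion of an (A, B, C) pattern if the items are unrelated."""
    result = Fraction(1)
    for item, positive in zip("ABC", pattern):
        result *= P[item] if positive else 1 - P[item]
    return result


def cumulative(pattern):
    """Expected proportion of a pattern if the items are strictly cumulative."""
    a, b, c = pattern
    if not (a <= b <= c):      # a non-cumulative pattern never occurs
        return Fraction(0)
    if a:
        return P["A"]
    if b:
        return P["B"] - P["A"]
    if c:
        return P["C"] - P["B"]
    return 1 - P["C"]


for pattern in product((1, 0), repeat=3):
    h0, hc = independent(pattern), cumulative(pattern)
    sign = "+" if hc > h0 else "-" if hc < h0 else "0"
    print(pattern, h0, hc, sign)
```

With these proportions only the all-positive and all-negative patterns are expected to be more frequent under cumulation than under independence; every other pattern is expected to be less frequent.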

The expected proportions obtained from these hypothetical distributions under the two assumptions of 1) independence of items, the null hypothesis, H0, and 2) cumulative interdependence of items, Hc, are tabulated in Figure 3. Comparing the proportions under H0 and Hc we can predict the direction, or sign, of differences to be anticipated between the expected frequencies under H0 and actually observed frequencies, if the direction of departure from independence is toward cumulative interdependence. These signs are listed in the last column of Figure 3 for the proportions used here as an example.


TABLE IX


PRUDENT-THEORETIC SCALE


FREQUENCY OF POSITIVE AND NEGATIVE RESPONSES TO EACH ITEM OF THE PRUDENT-
THEORETIC SCALE BY THE INDIVIDUALS OF EACH SCALE SCORE


                                       Items
                         A                B                C
         Fre-       Pos-   Neg-      Pos-   Neg-      Pos-   Neg-
Score    quency     itive  ative     itive  ative     itive  ative     Total

  3        61        61      0        61      0        61      0        183
  2        99        11     88        88     11        99      0        297
  1        93         2     91        12     81        79     14        279
  0        72         0     72         0     72         0     72        216

Total     325        74    251       161    164       239     86        975
Errors               13      0        12     11         0     14         50
C of R


In Tables XV-XX the chi-squared test has been applied to the goodness of fit of the obtained frequencies to the expected frequencies under the assumption of independence, H0, for each of the six scales. In each case the probability of obtaining the observed distribution, were the items actually independent, was less than .001. Therefore, we can reject the possibility that the items are independent with the confidence that we should be wrong in making such decisions fewer than one time in a thousand.
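
A goodness-of-fit computation of this kind might look like the sketch below, which is not a reconstruction of Tables XV-XX. It uses the prudent-theoretic pattern frequencies and item totals as far as they are legible in this copy (the zero for the pattern (+ + -) is inferred from row totals), and it assumes 4 degrees of freedom (8 patterns, less 1, less 3 estimated item proportions), which may differ from the authors' choice.

```python
from itertools import product

N = 325
POSITIVE = {"A": 74, "B": 161, "C": 239}  # positive responses per item
OBSERVED = {                              # (A, B, C) pattern -> frequency
    (1, 1, 1): 61,
    (0, 1, 1): 88, (1, 0, 1): 11, (1, 1, 0): 0,   # (1, 1, 0) is inferred
    (0, 0, 1): 79, (0, 1, 0): 12, (1, 0, 0): 2,
    (0, 0, 0): 72,
}


def expected_independent(pattern):
    """Expected frequency of a pattern if the three items were independent."""
    expected = float(N)
    for item, positive in zip("ABC", pattern):
        p = POSITIVE[item] / N
        expected *= p if positive else 1 - p
    return expected


chi_squared = sum(
    (OBSERVED[p] - expected_independent(p)) ** 2 / expected_independent(p)
    for p in product((0, 1), repeat=3)
)

CRITICAL_001_DF4 = 18.467  # tabled value at the .001 level, assuming 4 d.f.
print(round(chi_squared, 1), chi_squared > CRITICAL_001_DF4)
```

The contribution of each pattern can be inspected separately to see whether its difference lies in the direction predicted under cumulative interdependence.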

Moreover, the direction of the interdependence 
is substantially in the direction predicted under 
the assumption of cumulative interdependence. 
The contributions to the value of chi-squared of 
differences which are not in the predicted direc- 
tion have been segregated into the last column of 
each of the tables. They can be seen to be negli- 
gible—in no case affecting the level of significance 
of the value of chi-squared. 


Summary 


It was the purpose of this research project to construct an instrument whereby individuals might be validly ordered along six theoretical continua. It was determined that ‘‘order’’ should be assumed to exist only if individuals’ decision behavior evinced a particular cumulative structure, so that a person at any given position on the continuum would make the ‘‘favorable’’ decisions made by all persons ‘‘lower’’ on the continuum, plus one or more additional ‘‘favorable’’ decisions.

Because of the complexity of considerations involved in the process of making decisions in the variety of contexts offered as ‘‘items,’’ there was considerable ‘‘higgledy-pigglediness’’ in the patterns of the responses; the many discriminations being based on differences in a large number of factors, rather than cumulative differences in a uni-dimensional factor.

By a process of repeated consolidation of responses: first, by dichotomizing the original six responses into positive and negative responses, and then by combining items, and dichotomizing the responses to the elements into positive and negative responses to the compound items, it was found possible to eliminate most of the discriminations based upon idiosyncratic factors.


The remaining discriminations coarsely ranked the individuals in a cumulative order, so that one might say that an individual with a given scale score made all of the ‘‘favorable’’ responses of individuals with lower scale scores, plus one or more additional ‘‘favorable’’ responses, with 90-95 percent accuracy. It was possible to reject, with a high degree of confidence (p < .001), the possibility that these cumulative patterns of response could have arisen with such frequency by chance were the items independent.

The instrument has validity, based upon its internal cumulative structure, for the rank ordering of individuals along the six continua, which should be of value in relating these variables with any external variables.


FIGURE 3


A PRIORI EXPECTED PROPORTIONS UNDER THE ALTERNATIVE HYPOTHESES
OF INDEPENDENT ATTRIBUTES AND OF CUMULATIVE ATTRIBUTES


Columns: Response Pattern; Expected Proportion, H0; Expected Proportion, Hc; Predicted Sign of Difference


Note: A indicates the proportion of positive responses to item A; B, the positive responses to B; C, the positive responses to C; a, the negative responses to A; b, the negative responses to B; c, the negative responses to C.


TABLE X


PRUDENT-IMMEDIATE SCALE


FREQUENCY OF POSITIVE AND NEGATIVE RESPONSES TO EACH ITEM OF THE PRUDENT-
IMMEDIATE SCALE BY THE INDIVIDUALS OF EACH SCALE SCORE


                                       Items
                         A                B                C
         Fre-       Pos-   Neg-      Pos-   Neg-      Pos-   Neg-
Score    quency     itive  ative     itive  ative     itive  ative     Total

  3        42        42      0        42      0        42      0        126
  2       122        19    103       107     15       118      4        366
  1       126         2    124        17    109       107     19        378
  0        35         0     35         0     35         0     35        105

Total     325        63    262       166    159       267     58        975
Errors               21      0        17     15         0     23         76
C of R


TABLE XI


PRUDENT-AESTHETIC SCALE


FREQUENCY OF POSITIVE AND NEGATIVE RESPONSES TO EACH ITEM OF THE PRUDENT-
AESTHETIC SCALE BY THE INDIVIDUALS OF EACH SCALE SCORE


                                       Items
                         A                B                C
         Fre-       Pos-   Neg-      Pos-   Neg-      Pos-   Neg-
Score    quency     itive  ative     itive  ative     itive  ative     Total

  3        40        40      0        40      0        40      0        120
  2        84        22     62        65     19        81      3        252
  1       138         4    134        15    123       119     19        414
  0        63         0     63         0     63         0     63        189

Total     325        66    259       120    205       240     85        975
Errors               26      0        15     19         0     22         82
C of R


TABLE XII


THEORETIC-IMMEDIATE SCALE


FREQUENCY OF POSITIVE AND NEGATIVE RESPONSES TO EACH ITEM OF THE THEORETIC-
IMMEDIATE SCALE BY THE INDIVIDUALS OF EACH SCALE SCORE


                                       Items
                         A                B                C
         Fre-       Pos-   Neg-      Pos-   Neg-      Pos-   Neg-
Score    quency     itive  ative     itive  ative     itive  ative     Total

  3        48        48      0        48      0        48      0        144
  2       103        16     87        88     15       102      1        309
  1        90         5     85        12     78        73     17        270
  0        84         0     84         0     84         0     84        252

Total     325        69    256       148    177       223    102        975
Errors               21      0        12     15         0     18         66
C of R                  .94              .92              .94           .93


TABLE XIII


THEORETIC-AESTHETIC SCALE


FREQUENCY OF POSITIVE AND NEGATIVE RESPONSES TO EACH ITEM OF THE THEORETIC-
AESTHETIC SCALE BY THE INDIVIDUALS OF EACH SCALE SCORE


                                       Items
                         A                B                C
         Fre-       Pos-   Neg-      Pos-   Neg-      Pos-   Neg-
Score    quency     itive  ative     itive  ative     itive  ative     Total

  3        42        42      0        42      0        42      0        126
  2       127        23    104       107     20       124      3        381
  1       107         6    101        18     89        83     24        321
  0        49         0     49         0     49         0     49        147

Total     325        71    254       167    158       249     76        975
Errors               29      0        18     20         0     27         94
C of R                  .91              .88              .92           .90

contribution of the item being predicted. 
The lack of independence resulting from 
the inclusion of the ‘‘predicted’’ score in 
the total score does not substantially alter 
its significance where the number of items 
is large. 


7. Shirley A. Star. ‘‘The Screening of Psychoneurotics
in the Army: Technical Development
of Tests,’’ Studies in Social
Psychology in World War II, Vol. IV, Measurement
and Prediction, ed. by Samuel A.
Stouffer (Princeton: Princeton University
Press, 1950), p. 494: ‘‘... in no case did
the scale or quasi-scale scoring of an item
prove to be the reverse of the empirically
found differences on that item. This outcome
indicated, first, that our a priori
judgment had been correct and, second,
that the scale scoring, even on a dichotomous
basis, preserved the predictive efficiency
of the individual items....’’


8. Since the square of the phi-coefficient is directly
proportional to chi-squared when N is constant,
this basis of selection is analogous to that
recommended by Allen L. Edwards and F.
P. Kilpatrick, “A Technique for the Construction
of Attitude Scales,” Journal of
Applied Psychology, XXXII (1948), 374-84,
except that observed marginal totals rather
than Thurstone scale values were used to
obtain a range of items. Actually this last
criterion was applied loosely, and, as a
result, many “middle” items were selected,
discriminating at the same level. This
condition was corrected by the redefinition
of positive responses in the subsequent improvement
of the scales, infra.


9. Samuel A. Stouffer. “An Overview of the
Contributions to Scaling and Scale Theory,”
Measurement and Prediction, op. cit., p. 9.
Cf. Jane Loevinger. “A Systematic Approach
to the Construction and Evaluation
of Tests of Ability,” Psychological Monographs,
LXI (1947), No. 285; and “The
Technic of Homogeneous Tests Compared
with Some Aspects of ‘Scale Analysis’ and
Factor Analysis,” Psychological Bulletin,
XLV (1948), pp. 507-29; G. Murphy, L. B.
Murphy, and T. M. Newcomb, Experimental
Social Psychology (New York: Harper
and Bros., 1937), p. 897.


10. Louis Guttman. “A Basis for Scaling Qualitative
Data,” American Sociological Review,
IX (April 1944), pp. 139-50; “The
Basis for Scalogram Analysis,” Measurement
and Prediction, op. cit., pp. 60-90.


11. John Dewey. Logic: The Theory of Inquiry 
(New York: Henry Holt and Co., 1938), p.. 
with the kind of collection just said to have
the property of totality in contrast with a
merely numbered aggregate. ... Measured
collections involve (1) limits from which
to which; (2) something specified as a unit
for counting; and (3) progressive accumulation
of these units until the limit ad quem
is reached. The word accumulation as
here used involves something different
from the aggregation found in the merely
numerical set.... cumulative aspect in genuine
collective propositions signifies that such
propositions depend upon some principle
of arrangement or order which is derived
from the involved means-consequences
relation....”

Cf. also with Morris R. Cohen and
Ernest Nagel, An Introduction to Logic and
the Scientific Method (New York: Harcourt,
Brace and Co., 1934), especially Chapter
XV, Section 5, “The Formal Conditions
for Measurement,” pp. 297-98.


12. Qualitatively diverse elements may be operationally
equated and give rise to actuarial
indexes which prove valid for a specific
problem of external prediction. However,
they are not theoretically pregnant since
the grounds for their validity are ambiguous,
i.e., they are not internally valid in the
sense of “measuring” what they purport
to measure. Interpretation leads to fallacious
“inverse operationism.” Cf. Clyde
H. Coombs, “Theory and Methods of Social
Measurement,” Research Methods in
the Behavioral Sciences, ed. by Leon Festinger
and Daniel Katz (New York: Dryden
Press, 1953), pp. 471-535.


13. Cf. Louis Guttman. “The Utility of Scalogram
Analysis,” Measurement and Prediction,
op. cit., p. 159: “... This gradient
pattern of errors indicates that, while
there is not a single factor operating as in
the case of a scale, nevertheless there is
a single dominant factor and indefinitely
many small random factors, so that the
prediction of any external variable must
rest essentially on the dominant factor...”
See also ibid., pp. 160-63, 207, 208, 458,
494, and 547.


14. Samuel A. Stouffer and others. “A Technique
for Improving Cumulative Scales,”
Sociological Studies in Scale Analysis, ed.
by Matilda White Riley and others (New
Brunswick: Rutgers University Press,
1954), pp. 372-89.


15. In each case where a higher proportion of the
individuals in the lower of two adjacent
ranks gave positive responses to an item,
the frequency of positive and negative responses
in the two ranks was tabulated into
a 2 x 2 contingency table. The null hypothesis
H0, that the difference in proportions
of positive and negative responses
between the two groups, in this reversed
direction, could have occurred by chance
were there no relevant differences between
the ranks, was tested by: 1) the chi-squared
test where the total number of individuals
was greater than 40; or between 20 and 40,
provided that the expected frequency in
each cell was greater than 5; or 2) the
Fisher-Yates test of exact probability
where the number was less than 20; or between
20 and 40, where the expected frequency
in some cell was less than 5.

In all but two of the cases it was impossible
to reject H0 at the .05 level. Hence,
the null hypothesis is tenable for those
cases. The two exceptions are shown in
Figure 4, next page.

Reversals of this order, which could
have occurred approximately 2.5 times in
a hundred by chance, occurred twice in the
860 pairs of adjacent ranks of the six scales.
However, the presence of these reversals
confirms the fact, which our interpretation
of errors provides further grounds for affirming,
that the fineness of discrimination
provided by the separate ranks is not
valid, and may not be relevant to the common
factors in the continua.
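
The choice between the two tests described in this note can be sketched in Python. This is an illustration under stated assumptions, not the authors' computation: the function names are hypothetical, the chi-squared statistic is computed without a continuity correction, the exact test is taken as one-sided, and the boundary cases the note leaves open (totals of exactly 20 or 40, an expected frequency of exactly 5) are resolved arbitrarily in the comments.

```python
from math import comb, erfc, sqrt


def min_expected(a, b, c, d):
    """Smallest expected cell frequency of the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return min(r * k / n for r in rows for k in cols)


def choose_test(a, b, c, d):
    """Apply the selection rule of the footnote (boundaries are assumptions)."""
    n = a + b + c + d
    if n > 40 or (n >= 20 and min_expected(a, b, c, d) >= 5):
        return "chi-squared"
    return "fisher-yates"


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (1 df, uncorrected) and its upper-tail probability."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2))  # P(chi-squared with 1 df > stat)


def fisher_exact_one_sided(a, b, c, d):
    """P(top-left cell >= a) with all marginal totals held fixed."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    upper = min(row1, col1)
    ways = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return ways / comb(n, col1)
```

For a table such as `[[3, 1], [1, 3]]`, with only eight individuals, `choose_test` selects the exact test, and `fisher_exact_one_sided(3, 1, 1, 3)` gives 17/70.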


16. The “image” of each non-cumulative pattern
is a cumulative pattern, in most instances
of the next lower scale score. See Louis
Guttman, “Image Theory of the Structure
of Quantitative Variates,” Psychometrika,
XVIII (December 1953), pp. 277-79; and
“The Israel Alpha Technique for Scale
Analysis,” Sociological Studies in Scale
Analysis, op. cit., pp. 410-15.


17. The fact that assignment of frequencies to each
of the eight response patterns can be made
with only four degrees of freedom may not
be immediately apparent. A three-dimensional
contingency table, 2 x 2 x 2, provides
the eight cells needed to house all
possible combinations of positive and negative
responses to each of the three items.
This is illustrated in Figure 5.


FIGURE 5

2 x 2 x 2 CONTINGENCY TABLE HOUSING THE FREQUENCIES OF THE
JOINT OCCURRENCES AND NON-OCCURRENCES
OF THREE ATTRIBUTES


Given the marginal totals of the frequency
of positive and negative responses
to each of the three items, then the joint
frequencies in only four cells may be assigned
freely (provided the sum of the frequencies
assigned to cells in the same
plane does not exceed the marginal total of
that plane), the frequencies in the remaining
four cells being then determined by the
marginal totals.

For example, the front face of the cube
holds the four cells, a, b, c, and d, which
house the frequencies of the positive responses
to A. When three of these cells
have been arbitrarily filled, using three
degrees of freedom, the frequency in the
fourth cell is determined by the marginal
total of the positive responses to A.

The right face of the cube holds the four
cells, b, d, f, and h, which house the frequencies
of the negative responses to C.
Two of these cells, b and d, have already
been filled; hence the arbitrary choice of
the frequency of either, but not both, f or
h, is possible, the other being determined
by the total number of negative responses
to C. This uses one more degree of freedom.

The three cells, a, b, and f, of the upper
face of the cube have now been determined,
hence the frequency in cell e is determined
by the total number of positive
responses to B. Similarly, the frequency
in g is determined by the total number of
negative responses to B, cells c, d, and h
having already been determined.

The solution above was simplified by
first determining three cells in the same
plane. However, it can be shown with
simple algebra that there are only four degrees
of freedom regardless of the order
in which the individual cells are filled.
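
The counting argument of this note can be traced step by step in a short Python sketch. The function name and the example totals are hypothetical; the cell letters and faces follow the note (front face a, b, c, d for positive responses to A; right face b, d, f, h for negative responses to C; upper face a, b, e, f for positive responses to B).

```python
def fill_cube(n, a_pos, b_pos, c_neg, a, b, c, f):
    """Complete all eight cells from the margins and four freely chosen cells.

    n      total number of individuals
    a_pos  marginal total of positive responses to A (front face)
    b_pos  marginal total of positive responses to B (upper face)
    c_neg  marginal total of negative responses to C (right face)
    a, b, c, f  the four cells assigned freely
    """
    d = a_pos - (a + b + c)        # fourth cell of the front face
    h = c_neg - (b + d + f)        # remaining cell of the right face
    e = b_pos - (a + b + f)        # remaining cell of the upper face
    g = (n - b_pos) - (c + d + h)  # remaining cell of the lower face
    cells = dict(a=a, b=b, c=c, d=d, e=e, f=f, g=g, h=h)
    if min(cells.values()) < 0:
        raise ValueError("free cells exceed a marginal total")
    return cells


# Hypothetical totals: 100 individuals, 40 positive on A,
# 50 positive on B, 45 negative on C.
cells = fill_cube(n=100, a_pos=40, b_pos=50, c_neg=45, a=10, b=10, c=10, f=12)
print(cells, sum(cells.values()))
```

Once a, b, c, and f are chosen, no further choice remains: the other four cells follow from the marginal totals, which is the sense in which the table has four degrees of freedom.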


APPENDIX 


The items used in the inventory are listed below in order of the popularity of the ‘‘positive’’ response, 
and grouped to form the compound items of each scale. Each item is preceded by: 1) the number of the 
item in the inventory, 2) the proportion of individuals giving ‘‘positive’’ responses to the item, and 3) 
the range of responses designated as positive. The following notation is used to designate responses: 


A - strongly agree
B - moderately agree
C - slightly agree
D - undecided or no response
E - slightly disagree
F - moderately disagree
G - strongly disagree


Prudent-Theoretic Scale


A 


In science fiction stories, I like the ones with interesting scientific theories that hang
together, even if they’re not completely true, better than those about the social problems
of space settlements.


In studying about the building of the pyramids, I should be more interested in the en- 
gineering feat than in the class structure and economy of Egypt which made such mag- 
nificent display possible. 

Fabulous IBM machines are used to calculate insurance rates. However, data must 
be fed into the machines. The interviewing techniques for collecting data interest me 
more than an explanation of how the machines work. 

I would rather teach science than do research. 


B 


In our complex industrial civilization a young person should specialize early and stick 
to it. 


When I see an article about ‘‘electronic brains’’ I am more interested in finding out 
how they work than what their uses are. 


I should rather read and be able to understand William Shakespeare’s Hamlet than 
Michael Faraday’s Experimental Researches in Electricity. 
I like history and civics much better than science and mathematics.


C


I am more interested in finding out how TV has affected people’s taste than in finding 
out how TV works. 


I am more interested in following newspaper reports on the recent discoveries regarding
“negative matter” than on the developments in racial integration in the schools.


I would rather study algebra than history even though algebra seems to be almost totally
unrelated to any other subject.


If I had an hour to wait for a train I should more likely read The Scientific American 
than The Atlantic Monthly. 


Prudent-Immediate Scale 


A 


In a social studies course I would rather have the reasons why the U. S. didn’t join 
the League of Nations explained to me than try to figure it out. 


I should prefer the live theater to movies if they were the same price. 


Sometimes when a fellow is out with the gang, he pretty well has to do a few things he 
knows he really shouldn’t. 


Mercy killing should be legalized for cases of extreme suffering where there is no 
hope for cure. 


B 


When talking with my friends in the evening I’d rather talk about people we know and 
have fun with than talk about religion or philosophy. 


The opinion of friends helps more than reading in making up my mind. 


The foreign policy of our government should be based on high moral principles even 
though this may entail a loss of strategic power or prestige. 


I frequently think about the reasons for other people’s misbehavior instead of react- 
ing with irritation. 


C


I never worry about how things are going to work out—they usually seem to take care 
of themselves. 


An impulsive person is warm and sincere; one who analyzes his emotions is cold and
“phony.”


A business man should make his decisions strictly according to the interests of his 
business. He should not worry about what happens nationally to wages and prices. 


Science has definitely not been able to show that colored races are inferior to white 
races. 
Prudent-Aesthetic Scale


A 
Nobody should be allowed to cut down the redwood forests and turn them into lumber. 


If pay, housing, etc., were equal, I should like the work of a forest ranger better 
than that of a minister. 


I should rather take a ‘‘shop’’ course than a ‘‘world history’’ course. 


If I were a musician the thing I should like best about it would be getting across to the 
audience the basic idea of the composer. 


B 


I find paintings interesting when I am able to see how they represent the artists’ atti- 
tudes toward life. 


Art should be appreciated intuitively. Analysis destroys its beauty. 
I like to watch a big house afire.


To my way of thinking, the need to keep a city beautiful to look at is the most impor- 
tant argument in favor of smog control. 


C


Because they need to get close to life, artists are entitled to special consideration if 
they treat lightly the ties of marriage. 


Medical experiments using live animals are cruel and inhuman. 


Visiting a foreign country I would want to see the pageantry and architecture 
so I would not be interested in knowing in advance about their customs and history. 


Instead of developing expensive tastes, what I would like most to get from my educa- 
tion is either a purpose for my life or an affirmation of my present purposes. 


Theoretic-Immediate Scale 


A 


When I’m studying math or science it is refreshing to take frequent breaks watching 
TV or talking with a friend. 


When I’m watching a movie I sometimes lose track of the plot because I’m wondering 
how the lighting and stage effects are worked. 


I can visualize myself reading a paper at a scientific society meeting but not chatting 
socially in the corridor while a meeting is in progress. 


I usually like to do math problems alone rather than discuss them with others. 
B 


School mathematics courses should concentrate more on practical consumer and bus- 
iness training. 


I would sooner have a big living room for parties than have a workroom for hobbies. 


64  .29  F-G
29  .50  F-G
70  .54  F-G
22  .45  A-C


If I were employed by a company manufacturing chemicals, I would rather stay in re- 
search than become a company executive, so long as the loss in pay was not too great. 


I prefer a science class that is run along fairly formal lines so that I can avoid the 
distractions arising from personal entanglements with other members of the class. 


C


I enjoy working hard at a science project even if others don’t recognize my accom- 
plishment. 


I should rather be elected to the Student Council than be selected as an honor student 
in science. 


I don’t like being interrupted while I’m doing laboratory experiments by friends who
feel like talking.


I would rather be known as the writer of a social column published in many papers than 
as the Director of an astronomical observatory. 


Theoretic-Aesthetic Scale 


A 
If I were interested in studying flowers, I should be attracted chiefly by the beauty of 
the flowers. Comparing the structure of different kinds of flowers would not interest 
me much. 


I prefer chess to checkers. 


When you go on an automobile trip it is much more fun to pick places to stay as you 
go along rather than writing ahead for reservations. 


I never wonder how the time is going when I’m painting a room or sawing firewood 
like I do when I’m studying math or physics. 


B 


Chemistry experiments are fun to watch so long as there are plenty of explosions and 
color changes. 


A person should throw himself into life with vitality—the scientists’ reflection on how 
things work is a wet blanket on the spontaneous pleasures of affection. 


When I was little I liked erector sets more than tops. 


It’s a sloppy sailor who’ll let his sails flap while he basks in the sun and breathes the 
crisp salty wetness of the air; the keen sailor watches the wind, studies the tides, and
understands details of the rigging.
C
Scientists destroy much of the beauty of nature when they explain away its mysteries. 
The forces of nature are subjects for wonder and awe—not analysis. 


“Hotrod” racing would be fun if you didn’t have to know about and work on motors.


When I look at the stars at night I sometimes meditate on the way the universe works. 
Aesthetic-Immediate Scale


A 
I would rather go sailing by myself than watch a football game. 


If I had the necessary athletic prowess I should prefer to excel in the cross-country
marathon than in football.


11 .41 A-D  I like to ride alone. The feel of a good horse under me, his strength and his rhythm,
more than make up for the lack of fellow riders.


Abilities of sign writers are different from the abilities of men who run a sign business.
If I had the ability to do either, I would rather learn to run the business than
paint the signs.


B 


I spend more of my free time on hobbies like stamp collecting, woodwork, etc., than 
going to parties or entertaining friends. 


Sailing on a boat would be fun with a group of people, but I don’t think I’d care much 
for it by myself. 


I like Dixieland jazz better than “rock and roll.”


A girl should wear sweaters and ‘‘pearls’’ or whatever most of the girls are wearing 
rather than conspicuous hand-made jewelry. 


C
I should prefer to be a machinist rather than a salesman. 


I enjoy swimming in the ocean by myself, or, for safety, with a companion or life 
guard, more than swimming in a pool. 


I think I should enjoy Longfellow’s poem ‘‘Evangeline’’ more if it were told as a love 
story in modern prose. 


55 .86 B-G  When I am buying clothes I pay less attention to the ones that don’t show than to the
ones that do show.
STUDIES IN SOCIAL PERCEPTION:
WORD PRODUCTIVITY


FRANK J. ESTVAN*
University of Wisconsin


THIS IS A study of the length of children’s oral 
responses to a series of pictures. It is one phase 
of research dealing with children’s social percep- 
tion in which picture projection was employed. 
Word Productivity is of concern to workers in the 
field of projective testing who are faced with the 
problem of interpreting various measures of pro- 
ductivity. It has also been studied by specialists 
in human development and language arts whose 
findings seem to indicate that urban children are 
more verbal than rural, that girls excel boys in 
language development, and that children with high 
I.Q.’s are more verbally adept than those with 
less intelligence. 

The present study is based on the assumption
that behavior is caused, and that word count should,
therefore, reveal something about the individual.
Acceptance of this assumption would make tenable
several explanations for the number of words produced
in response to the series of pictures used
in this research. One would be that Word Productivity
is an index of interest or effort, sometimes
referred to as subject cooperation. Another
is that Word Productivity is a reflection of the
meaningfulness of a situation to the individual.
Third, Word Productivity may be an indicator of
the pictorial quality or provocativeness of a picture
irrespective of its content.

To test these hypotheses, comparisons have 
been made of the Word Productivity of rural and 
urban children, boys and girls, first and sixth 
grade pupils, as well as high I.Q. and low I.Q.
groups. If differences among these groups can be 
explained in terms of what is already known about 
children or by a new set of constructs, the useful- 
ness of Word Productivity as an aid in the inves- 
tigation of social perception will have been estab- 
lished. 

It must be noted that the verbal behavior being
examined is not what an individual knows about
words, but the number of words he uses to communicate
ideas. This distinction is the same as
that between knowing how well a child can read
and what he does read. The meaningfulness of
the responses elicited in this study will be found
in another source. 1


* All footnotes will be found at end of article.


Source of Data2 


The data analyzed in this report were obtained 
in Part I of the Social Perception Interview. A 
Life-Situation Picture Series consisting of 14 
black and white 2 x 2 inch slides was projected on 
a 30 x 40 inch screen, the directive for each stim- 
ulus being ‘‘What story does this picture tell?’’ 
The subject’s remarks were tape recorded and 
typewritten verbatim to yield a highly accurate 
record of his free-association responses. 

Of the total of 88 subjects selected at random, 
half were drawn in equal proportions from the 
first and sixth grade children attending one-room 
schools in a rural Wisconsin county. The other 44
subjects were similarly selected from the first and
sixth grade pupils attending the public elementary 
schools of a Wisconsin community of approximately 
55,000 population located in a highly urbanized 
region. Sex was equated within each grade level 
for both rural and urban samples, thus providing 
eight subgroups of 11 subjects each. Intelligence 
groups were formed by selecting the two persons 
having the highest Intelligence Quotient in each of 
the eight subgroups to constitute the high-ability 
sample, and using similar procedures at the op- 
posite extreme to form the low-ability group. 
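
The formation of the two intelligence groups can be pictured with a short Python sketch. Everything in it is hypothetical (the function name, the record layout, and the invented I.Q. values); it only illustrates that taking the two highest and two lowest I.Q.'s in each of the eight subgroups of 11 yields two groups of 16 subjects.

```python
from itertools import product


def extreme_iq_groups(subjects, per_subgroup=2):
    """Return (high, low): the top and bottom scorers within each subgroup."""
    by_subgroup = {}
    for subject in subjects:
        by_subgroup.setdefault(subject["subgroup"], []).append(subject)
    high, low = [], []
    for members in by_subgroup.values():
        ordered = sorted(members, key=lambda s: s["iq"])
        low.extend(ordered[:per_subgroup])
        high.extend(ordered[-per_subgroup:])
    return high, low


# Eight subgroups (residence x grade x sex) of 11 subjects, invented I.Q.'s.
subgroups = list(product(("rural", "urban"), ("first", "sixth"), ("boy", "girl")))
subjects = [{"subgroup": g, "iq": 90 + i} for g in subgroups for i in range(11)]
high, low = extreme_iq_groups(subjects)
print(len(subjects), len(high), len(low))
```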

The determination of Word Productivity in- 
volved more than the mechanical counting of a suc- 
cession of words. In many instances, considera- 
tion had to be given to the context in which an ex- 
pression occurred. To standardize procedures, 
the following guides were established: 


A. Counted as one word

1. Any word contained in a standard English
dictionary.

2. Slang expressions or colloquialisms: cop,
yeah, sorta’, lotsa’.

3. Contractions: isn’t, there’s, ain’t.

4. Compound words: right-handed.

5. Abbreviations: YMCA, O.K., ABC’s.

6. Numbers below 100.

7. Partial words when the context indicated
their use as a complete word: “semi”
(semi-trailer truck), “spose” (suppose),
“veyor” (surveyor).

8. Repetitions when they were part of the growing
perception of the subject: “boy...I see
a boy” (each word counted), “and...and a
girl.”


B. Not counted

1. Sounds: ah, um, mm.

2. Incompleted words which occurred when the
subject changed his mind after he had begun
a word: “sa...” (sad), “peo...” (people).

3. Repetitions or stammering unrelated to
thought processes: “and...and...and...”
(latter two not counted), “a...a...a...a...”
(latter three not counted).
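
The mechanical part of these guides can be sketched in Python. This is only an illustration, not the procedure used in the study: the names are hypothetical, and it covers just the rules that need no judgment (contractions, hyphenated compounds, and abbreviations counted as one word; the sounds of rule B1 skipped). Rules A7, A8, B2, and B3 depend on context and are deliberately left to a human reader.

```python
import re

FILLER_SOUNDS = {"ah", "um", "mm"}  # rule B1: sounds are not counted

# One token = letters or digits, optionally joined by internal apostrophes,
# hyphens, or periods, so that "isn't", "right-handed", and "O.K." each
# count as a single word (rules A3, A4, A5).
TOKEN = re.compile(r"[A-Za-z0-9]+(?:['’.\-][A-Za-z0-9]+)*")


def count_words(transcript: str) -> int:
    """Count words in a transcript under the mechanical rules only."""
    text = transcript.replace("...", " ")  # pauses separate words
    tokens = TOKEN.findall(text)
    return sum(1 for token in tokens if token.lower() not in FILLER_SOUNDS)


print(count_words("Um, there's a boy... I see a boy, ah, he's right-handed. O.K."))
```

Because repetitions are always counted here, a transcript containing stammering would be overcounted until rule B3 is applied by hand.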


In order to eliminate observer influence, no
attempt was made to estimate the number of words
missed in those few instances when responses
were not audible on the tape recorder. Neither
was the observer required to determine when the
child had stopped talking about one life-situation
picture and was signalling for the next, such remarks
as “that’s all,” “O.K.,” “I don’t know
any more” being included in the word count for
the picture being viewed on the screen.


Does Word Productivity Indicate Pupil
Involvement?


This is the first and most basic question which 
needs to be answered. If the differences in Word 
Productivity among the pictures in the series are 
so small as to reflect the operation of chance, a 
lack of personal involvement on the part of the 
subjects could be inferred. This would tend to in- 
validate the data secured in the picture-story por- 
tion of the Social Perception Interview, and cast 
doubt on all the procedures which followed. 

It is conceivable that the life-situation pictures
might be equally stimulating in content or form of
presentation to the subjects in this study. The
fact that these situations vary in: a) social background
representation (rural-urban, high status-low
status, child-adult), b) social functions depicted
(10 basic life activities), c) scope or complexity
of situations (general environment-specific
social function), and d) child-figure presentation
(clearly defined or ambiguous) would seem to
lessen such a possibility. For all these variables
to average out among the 14 pictures in approximately
equal word responses is not likely unless
subjects react in a random manner rather
than to the elements in each situation.

Table I presents the variance in Word Productivity
of the 88 subjects to each of the 14 life-situation
pictures. The test for homogeneity of
variance establishes these differences at a very
high level of confidence. This means that the variability
in Word Productivity among the 14 pictures
is too great to consider these responses as belonging
to the same population. In short, the differences
in Word Productivity among the pictures
in the series would appear to be real and not
simply due to chance. The subjects in this study
were not responding to the Life-Situation Picture
Series in random fashion.
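
The note to Table I refers to Fmax, the ratio of the largest to the smallest sample variance. A minimal Python sketch of that ratio follows; the function name and the word counts are hypothetical, and the critical value (1.80 at the .01 level in Table I) applies only to the study's own design of k = 14 pictures with n = 87.

```python
from statistics import variance


def f_max(samples):
    """Ratio of the largest to the smallest sample variance."""
    variances = [variance(sample) for sample in samples]
    return max(variances) / min(variances)


# Invented word counts for three pictures, five subjects each.
word_counts_by_picture = [
    [12, 30, 25, 41, 18],
    [40, 95, 10, 66, 120],
    [22, 27, 35, 19, 30],
]
print(round(f_max(word_counts_by_picture), 2))
```

A ratio larger than the tabled critical value for the design in question is taken as evidence that the variances are not homogeneous.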


Is Word Productivity Related to Certain
Pupil Abilities?


In considering the significance of Word Productivity,
the possibility exists that the total number
of words produced is simply an index of general
intelligence or a more specific verbal factor. It
may be that individuals have developed a generalized
verbal behavior pattern, as far as quantity is
concerned, which is more influential in determining
Word Productivity than are the pictures themselves.
The broader base of general intelligence
must also be examined in this connection, because
of the high relationship which usually prevails between
scores on intelligence tests and tests of
verbal abilities.

A high correlation between Word Productivity
and intelligence or verbal facility would discount
the meaningfulness of the subject’s reactions to
the life situations. The quantitative aspect of his
responses could then be attributed to an underlying
“native” ability in general or to a more specific
verbal factor. If no relationship were found
between these measures, one would have greater
confidence in regarding differences in Word Productivity
as being associated with differences in
the meaningfulness that these situations have for
elementary school children.

Measures of intelligence and verbal ability 
were obtained from the Stanford-Binet Intelligence 
Test, Form L. The Vocabulary score is the num- 
ber of words for which correct definitions were 
given until six consecutive words were missed. 
These data were available for all 88 subjects in- 
cluded in the study, and may be regarded as a 
qualitative aspect of verbal behavior. The Word 
Fluency score is the number of words named in 
one minute and, thus, is a quantitative measure 
of verbal facility. The latter is Item 5 for Year 
X, and was attempted by 55 percent of the first-
grade children and 73 percent of the sixth-grade
children, thus providing sufficient data for analy- 
sis. 

As is indicated in Table II, none of the correlations
between Word Productivity and a) Mental
Age, b) Intelligence Quotient, c) Vocabulary Score,
and d) Word Fluency Score is statistically significant
for the first grade, the sixth grade, or
for both groups combined. In every instance,
however, the correlation of Vocabulary Score with
Mental Age and Intelligence Quotient is significant
beyond the one percent level of confidence.
For the total group of 56 subjects who responded
to the item, the correlation of Word Fluency Score
with Mental Age and Intelligence Quotient is likewise
significantly different from zero. This is
also true with respect to Word Fluency Score and
Intelligence Quotient in the First Grade group.

The fact that Word Productivity is related neither
to level or rate of intelligence nor to qualitative
or quantitative measures of verbal ability
would support the conclusion that word responses
to the Life-Situation Picture Series reflect something
other than these measures. The Social
Perception Interview, from the standpoint of word
productivity, can make no claim to being another
form of intelligence test or test of vocabulary.


TABLE I

HOMOGENEITY OF VARIANCE IN WORD PRODUCTIVITY OF TOTAL GROUP
FOR LIFE-SITUATION PICTURE SERIES

Picture
No.  Name
 1   Bedroom         125839
 2   Hovel           227797
 3   Dam             253489
 4   Factory         206103
 5   Mansion         253121
 6   Church          144502
 7   Farm            203506
 8   Swimming Hole   211741
 9   Capitol         192746
10   City            205728
11   Resort          211676
12   Schoolroom      217738
13   Village         237508
14   Dock            287607

S*max 1289.50

For k = 14 and n = 87: P(Fmax > 1.80) = .01**


TABLE II

CORRELATION OF WORD PRODUCTIVITY AND MEASURES OF INTELLIGENCE
AND VERBAL ABILITY

First Grade Sixth Grade
2 2

985** .993**
.683**

.254 100

Total Group
2

- 482**
- 931** .411**
569** -.217*

Key:
1 - Word Productivity
2 - Mental Age
3 - Intelligence Quotient
4 - Vocabulary Score
5 - Word Fluency Score 1

1 N = 24 first grade and 32 sixth grade
instead of 44 for each grade as for
other measures.

* P < .03
** P < .01


Is Word Productivity Related to Characteristics
of the Instrument?


Differences in Word Productivity might also
be explained in terms of certain characteristics
of the instrument rather than being attributed to
differences in the meaningfulness of the life situations
to subjects. It could be that while the subject
matter of a picture is interesting to elementary-school
children, its pictorial treatment has
little appeal. There is also the possibility that
the placement of a picture in the series will affect
the length of verbal responses elicited. Word
Productivity of the beginning and latter pictures
in the series may simply be indications of the need
for a “warming-up” period or fatigue due to the
length of the series.

Pictorial Representation: One way to examine
the first question is to determine the relationship
in Word Productivity among the various pictures
in the series. A significant negative correlation
between one picture and the others in the series
would call for extended analysis to discover the
factors which might explain such deviate responses.
These could be matters of form or content.
The intercorrelations are positive and statistically
significant, thus lending a high degree of confidence
in a relationship that is not zero. Of the
91 correlations, 30 are +.80 or higher, the highest
being between Picture No. 10 (City) and Picture
No. 13 (Village), which are the most clear-cut
paired comparisons in the series. Except for
Picture No. 1 (Bedroom), each picture correlates
+.80 or higher with one or more pictures, Picture
No. 12 (Schoolroom) correlating with nine other
pictures at this level of relationship.

The generally high positive correlation in Word
Productivity among the 14 life-situation pictures
indicates that no one picture stands apart from the
rest of the series because of extremely high or
low stimulating properties. Although the pictures
in this series vary in their power to elicit word
responses from elementary school children, this
difference is not so extreme in any one picture as
to question its suitability in content or pictorial
qualities.

Picture Sequence— The fact that the first pic- 
ture in the series is the only one which did not 
correlate as high as +.80 with others in the series
is one reason for investigating the relationship between
Word Productivity and Picture Sequence. (The
correlation is between a picture's position in the
series and its rank in Word Productivity, rank 1
denoting the greatest output.) A
significant negative correlation could mean that
subjects gained in confidence and interest and,
hence, increased in Word Productivity as they
proceeded through the series, or that the pictures
toward the end of the series were more provocative.
A high positive correlation, on the other
hand, could mean that subjects became fatigued
or lost interest as they proceeded, that the later
pictures in the series were less stimulating, or
that subjects became more direct and efficient in
expressing themselves as they gained practice. A
lack of relationship between Word Productivity and
Picture Sequence would indicate the absence of
systematic influences such as “warming up,” fatigue,
or practice effects, and an evenness in the
stimulating quality of the life-situation pictures.

Table IV indicates that for the total group of 88
subjects there is no relationship between Word
Productivity and the order in which pictures are
presented. This is also true for the comparative
groups listed in Table V with the exception of the
High Intelligence group. In this case, both rho
and Kendall’s tau are significantly different from
zero at the 4 percent level. If the
pictures at the end of the series were more stimulating,
we would expect an increase in Word Productivity
from all groups. In the absence of this
tendency, a more plausible explanation is that
bright children profit greatly from experience
and, therefore, reach higher levels of performance
as they progress through the series. For
elementary-school children in general, it is clear
that differences in the Word Productivity of various
pictures are not related to their position in
the series.
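
The rank correlation reported for the total group can be retraced from the word totals of Table IV. The following is a minimal sketch in Python, not the authors' own computing procedure: it assumes that the totals are listed in order of presentation, that rank 1 is given to the picture with the most words, and that there are no ties. The function names are illustrative only, and the t conversion is the usual approximation with n - 2 degrees of freedom.

```python
import math

# Total words elicited by each picture, in order of presentation (Table IV).
WORDS = [2619, 3747, 3877, 3391, 3989, 2906, 3448,
         3399, 3296, 3438, 3474, 3380, 3494, 3929]


def productivity_ranks(words):
    """Rank pictures by word total; rank 1 = most words (assumes no ties)."""
    order = sorted(range(len(words)), key=lambda i: words[i], reverse=True)
    ranks = [0] * len(words)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman_rho(x, y):
    """Spearman's rho for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))


def t_from_rho(rho, n):
    """Approximate t statistic for rho, with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1 - rho * rho))


sequence = list(range(1, len(WORDS) + 1))   # position in the series
ranks = productivity_ranks(WORDS)           # rank in Word Productivity
rho = spearman_rho(sequence, ranks)

print(ranks)
print(round(rho, 3))
print(round(t_from_rho(-0.556, 14), 2))     # High Intelligence group, 14 pictures
```

Under these assumptions the sketch gives rho = -.147, in agreement with Table IV, and ranks of 14 and 13 for Pictures No. 1 and No. 6. Applied to the High Intelligence coefficient of -.556, the t approximation comes to about -2.32, close to the value noted under Table V.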


It is interesting to note in Table V that the rank- 
ings for Picture No. 1 (Bedroom) and Picture No. 
6 (Church) differ by only one point among the eight 
subgroups, and that they rank 14th and 13th re- 
spectively. Whether scenes of a child in bed and 
people going to church seem so obvious to children 
as to need no prolonged comment or whether they 
are less interesting or meaningful than the other 
pictures can only be surmised. For each of the 
remaining 12 pictures, however, the rankings 
made by the comparative groups differ from 4 to 
9 steps, the average rank-difference being 7. 


Is Word Productivity Related to Differences in 
Social Background and Intelligence? 


Meaningfulness is a function of the kind of ex- 
periences which an individual has, and his ability 
to learn from these experiences. Accordingly, it 
TABLE IV


CORRELATION OF WORD PRODUCTIVITY AND PICTURE SEQUENCE
FOR TOTAL GROUP


Picture      Total Words

    1            2619
    2            3747
    3            3877
    4            3391
    5            3989
    6            2906
    7            3448
    8            3399
    9            3296
   10            3438
   11            3474
   12            3380
   13            3494
   14            3929

Total           48387


rho = -.147
TABLE V


CORRELATION OF WORD PRODUCTIVITY AND PICTURE SEQUENCE
FOR COMPARATIVE GROUPS


         Community          Sex             Grade        Intelligence
       Rural   Urban     Boy    Girl     1st     6th     High*    Low*

[Rank entries for the individual pictures are not recoverable.]

rho    -.033   -.275    -.112   -.293    .143   -.332   -.556**  -.041


* N = 16 instead of 44 as for other groups.

** For rho = -.556: t = -2.321, P < .04. For Kendall’s tau = -.429: z = -2.080, P < .04.

For computing rho, t, and Kendall’s tau see: Maurice G. Kendall, Rank Correlation Methods
(London: Charles Griffin and Co., Ltd., 1948), Chaps. I and IV.
would be reasonable to expect that children having
different social backgrounds, such as those associated
with rural-urban living, sex, and age
(grade), would respond differently to a variety of
scenes of American life. For the same reason,
the reactions of bright children might differ from
those of children who are less intelligent.

Community and Sex Differences—To determine 
whether Word Productivity is related to type of 
community lived in and to sex, separate analyses were
made of the four first-grade subgroups (1RB, 1RG,
1UB, 1UG) and of the four sixth-grade
subgroups (6RB, 6RG, 6UB, 6UG). Tests of homogeneity
of variance for each picture, summarized
in Table VI, indicate that with one exception
in the first grade, and two in the sixth, the subgroups
did not differ significantly in the variability
of their responses. The first-grade urban
girls contributed disproportionately to the variance
for Picture No. 14 (Dock), and the sixth-grade
rural boys accounted for the great variability
in responses to Picture No. 4 (Factory)
and to Picture No. 8 (Swimming Hole). In all
three instances, sex produced greater variance
than community lived in. The results of the analysis
of variance for the first-grade subgroups and
for the sixth-grade subgroups are presented in
Table VII, and the main effects variance in Table
VIII. In none of the 56 analyses (14 pictures × 2
variances × 2 grade groups) was the F ratio statistically
significant. This is convincing evidence
that the locality in which the individual lives and
sex are not related to the mean number of words
given in response to the life-situation pictures.
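
The homogeneity check summarized in Table VI is headed Fmax, which is taken here to be the ratio of the largest to the smallest subgroup variance (Hartley's test). The sketch below illustrates that ratio in Python. The word counts are hypothetical, the function name is illustrative, and no critical value is supplied, since the tabled value is not legible in the source.

```python
from statistics import variance


def f_max(groups):
    """Largest sample variance divided by the smallest (Hartley's F-max)."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical word counts for the four first-grade subgroups on one picture.
subgroups = {
    "1RB": [30, 32, 34, 36, 38],
    "1RG": [20, 30, 40, 50, 60],
    "1UB": [25, 30, 35, 40, 45],
    "1UG": [10, 25, 40, 55, 70],
}

ratio = f_max(list(subgroups.values()))
print(ratio)
```

A ratio near 1 indicates subgroups of similar variability. In practice the ratio would be compared with the tabled critical value for the number of groups and the size of each group.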

Grade Differences—A comparison of the Word 
Productivity of the 44 first-grade subjects and the 
44 sixth-grade subjects is presented in Table IX. 
Except for Picture No. 1 (Bedroom) and Picture 
No. 2 (Hovel), the mean number of words pro- 
duced by the sixth-grade pupils was greater than 
that for the first. In two cases, Picture No. 5 
(Mansion) and Picture No. 14 (Dock), this difference
was great enough to be considered significant
rather than due to sampling fluctuations. In both
groups, the Mansion scene ranked high in Word 
Productivity, the sixth grade placing it second 
and the first grade ranking it third. Whereas the 
Dock scene ranked highest in the sixth-grade group, 
it placed only seventh in the first grade. 

It would appear that both groups have come to
realize the significance of living in wealthy circumstances,
and that the difference between the
first and sixth grades on the Mansion picture is one
of degree, due to the fact that sixth-grade children
have had a longer period of acculturation. To understand
why the greatest difference between the
grade groups should appear with reference to the
Dock scene, it is important to note that this picture
was planned to depict human relations on a
world-wide scale. That sixth-grade children
were more stimulated to talk about world relationships
than were first graders fits in with the commonly
held notion that children proceed from the
“near” to the “far” in their development of interests.

Intelligence Differences—In Table X, the Word 
Productivity of 16 high-I.Q. subjects and 16 low-I.Q.
pupils is compared for each of the life-situation
pictures. In every case, the mean Word
Productivity of the former group exceeds that of
the latter. For six of the 14 pictures, the difference
is statistically significant. This is considerable
evidence that bright children, irrespective of where they
live, their sex, or their grade, are more stimulated to
talk about certain life-situation pictures than is
a comparable group of children who differ in being
below average in intelligence.

Why bright children should be more productive
on the City (No. 10) and Farm (No. 7) pictures in
the community pattern and on the Hovel (No. 2) and
Resort (No. 11) pictures in the social-status block
is not immediately apparent. The pictures of
a Dam (No. 3) and of the Capitol (No. 9), however,
are two of the three adult pictures designed
to contrast age differentials in social experience.
It is, therefore, not difficult to understand why
children of high intelligence would be more stimulated
than less intelligent children by scenes depicting
the conservation of natural resources, as in the Dam,
and citizenship values, as reflected in the Capitol building.


Are Extreme Deviations in Word Productivity 
for Each Individual Related to Differences in 
Social Background and Intelligence? 


In the foregoing analyses, Word Productivity
has been examined from the point of view that significance
is in direct proportion to magnitude. Significant
behavior, however, may be expressed by
avoidance as well as attraction, and either may be
prompted by the same underlying causes. With
reference to Word Productivity, this would be indicated
by an extremely meager word count or by
an exceptionally productive response. A determination
of deviant behavior in these terms would
have to take into account the wide range of individual
differences in general level of Word Productivity.
A word count of 63, for example, might
indicate a high response for one individual, and
average or below-average productivity for another.
Hence each individual’s own word responses must
constitute the basis for this type of analysis.

TABLE VI


HOMOGENEITY OF VARIANCE IN WORD PRODUCTIVITY OF FIRST
AND SIXTH GRADE SUBGROUPS FOR EACH PICTURE


[Columns: Picture; Fmax, 1st Grade; Fmax, 6th Grade. Only the entries 2.69, 2.17,
and 2.97 and the note “For k = 4 and N = 10:” are recoverable.]


To determine what is a deviant response for an
individual, the assumption must be made that word
responses to the 14 pictures are normally distributed.
It is then possible to determine mean Word
Productivity and variability in word responses for
the series. If the responses beyond ±1 SD from
the individual’s mean Word Productivity are designated
as exceptional for him, approximately 32
percent, or an average of 4.4, of his picture responses
would be included in the negative and positive
tails of the distribution. If cut-off points are
established at ±2 SD from the mean, somewhat
less than 5 percent, or .64 pictures, would be singled
out. (In actual practice, the latter resulted
in the identification of such unusual responses for
one of every two subjects.)
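
The expected counts of 4.4 and .64 pictures follow from the normal-curve areas beyond one and two standard deviations. The sketch below retraces that arithmetic in Python and also shows how deviant pictures might be flagged for a single subject. The subject's word counts are hypothetical, the function names are illustrative, and the standard deviation is computed with N in the denominator, which is an assumption, as the source does not state which form was used.

```python
import math
from statistics import mean, pstdev


def two_tail_probability(k):
    """P(|Z| > k) for a standard normal variable."""
    return math.erfc(k / math.sqrt(2))


def expected_extreme_pictures(k, n_pictures=14):
    """Expected number of pictures beyond +/- k SD if responses are normal."""
    return n_pictures * two_tail_probability(k)


def deviant_pictures(word_counts, k=1.0):
    """Picture numbers below and above k SD of the subject's own mean."""
    m = mean(word_counts)
    sd = pstdev(word_counts)  # assumption: SD computed with N, not N - 1
    low = [i for i, w in enumerate(word_counts, start=1) if w < m - k * sd]
    high = [i for i, w in enumerate(word_counts, start=1) if w > m + k * sd]
    return low, high


print(round(expected_extreme_pictures(1), 1))   # about 32 percent of 14
print(round(expected_extreme_pictures(2), 2))   # somewhat less than 5 percent of 14

# Hypothetical word counts for one subject on Pictures 1 through 14.
subject = [12, 40, 38, 35, 90, 15, 36, 34, 33, 37, 39, 35, 36, 64]
print(deviant_pictures(subject, k=1.0))
```

Because the cut-off points are set from each subject's own mean and variability, the same word count may be flagged for one child and not for another.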

Total Group— The number of instances when 
the Word Productivity for a picture was beyond
±1 SD and ±2 SD from each individual’s mean is
shown for the total group of 88 subjects in Table
XI. Although there is no appreciable difference
between the total number of very limited responses
and the number showing great productivity at
the ±1 SD level, the difference is highly significant
for the ±2 SD scores. This tendency for extremely
marked variability to be in the direction
of over- rather than under-production is, therefore,
one which may be held with a high degree of
confidence. Its logic stems from the fact that
whereas the zero limitation applies to the negative
end of the distribution, there is no ceiling for responses
falling on the positive side of the scale.

At the ±1 SD level of variability, five pictures
show a statistically significant difference between
the number of times they evoked little or much
productivity. In Picture No. 2 (Hovel), Picture
No. 5 (Mansion), and Picture No. 14 (Dock), the
direction is positive, whereas in Picture No. 1
(Bedroom) and Picture No. 6 (Church) the direction
is negative. This is in keeping with the findings
presented in Table IV showing these pictures
to rank 4, 1, 2, and 14, 13 respectively in Word
Productivity.
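
The comparison of under- and over-production for a single picture can be illustrated by a chi-square test with one degree of freedom, using the critical values given in the note to Table XI. The sketch below is offered in Python under the assumption that the two counts are tested against an even split. The source does not state the computing formula, and the counts used are hypothetical.

```python
# Critical values of chi-square for 1 df, from the note to Table XI.
CRITICAL = [(10.827, ".001"), (6.635, ".01"), (3.841, ".05")]


def chi_square_equal_split(under, over):
    """Chi-square (1 df) for the departure of two counts from an even split."""
    expected = (under + over) / 2
    return ((under - expected) ** 2 + (over - expected) ** 2) / expected


def significance_level(chi2):
    """Most stringent tabled level exceeded, or None if not significant."""
    for critical, level in CRITICAL:
        if chi2 > critical:
            return level
    return None


# Hypothetical counts: 8 meager responses and 24 voluminous ones.
chi2 = chi_square_equal_split(under=8, over=24)
print(chi2, significance_level(chi2))
```

With very small counts the expected frequencies become too low for the test to be trusted, which corresponds to the cases described in the text as too small to be tested.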

Differences between the ±2 SD scores for each
picture are not statistically significant or are too
small to be tested. It is interesting to note, however,
that except for Picture No. 3 (Dam), the
Bedroom and Church scenes are the only ones for
which such an extremely limited response was
forthcoming. The Bedroom situation, however,
is also among the highest producers at the +2 SD
level, which is not the case for the Church scene.
It should also be noted that Picture No. 12 (Schoolroom)
is the only one which did not elicit at least
one voluminous response from the 88 children
concerned.

The above findings support the generalization
that elementary-school children are more stimulated
to talk about conditions of poverty (No. 2)
and wealth (No. 5), as well as the far-away places
intimated in the Dock scene (No. 14), and are less
stimulated by such immediate experiences as
sleeping (No. 1) and going to church (No. 6). It
must be remembered, however, that these can be
held only as tentative conclusions, for it is quite
possible that another set of illustrations of the
same life activities might result in different responses.
Comparative Groups—Analyses of the preceding
data in terms of community, sex, grade, and
intelligence comparisons are given in Table XII
(±1 SD) and Table XIII (±2 SD). In no instance is
the distribution of positive and negative deviations
for two groups significantly different. Each exhibits
the same general trends as for the total
group reported in Table XI, or varies so little as
to be considered a random fluctuation from these
trends.

When these same data are compared in terms 
of the social-background patterns on which the 
Life-Situation Picture Series was designed, cer- 
tain statistically significant differences appear in 
all but the rural-urban comparison (Table XIV). 
Girls were more productive on high social-status 
pictures, but boys responded better to low social- 
status situations (Table XV). Whereas first-grade 
children were less productive on adult scenes,
sixth-grade pupils were less stimulated by pictures 
centered on child-experiences (Table XVI). The 
high intelligence group was more responsive to the 
adult pictures, and the low intelligence group re- 
sponded more freely to the child-experience pic- 
tures (Table XVII). 

That rural children would differ from urban 
children in their reactions to scenes portraying 
rural-urban differentials seems to be a reasonable 
expectation. Why such differences in Word Pro- 
ductivity failed to materialize cannot be accounted 
for at this time. It may be that this differential
does not exist in Southeastern Wisconsin. The 
rural-urban groups being equated in sex and grade, 
no differences were expected in the reactions of 
these groups to the social status and age patterns, 
and none appeared. 

Why boys should talk at such great length about low
social-status pictures can be explained, in part,
by the weighting produced by the Swimming Hole 
scene (No. 8) which might be more attractive to 
boys than to girls. The consistently greater re- 
sponse of the girls to the high social-status pic- 
tures is in keeping with developmental data bear- 
ing on the advanced social maturity of girls over 
that of boys. With this maturity could go an in- 
creased sensitivity to the status symbols depicted 
in the Mansion (No. 5) and Resort (No. 11) scenes. 
There being equal representation of rural-urban 
and first- and sixth-grade children in the sex 
groups, there would be no reason for anticipating 
differences in their responsiveness to the com- 
munity block of situations and the age pattern. No 
sex differences were discovered in these two pat- 
terns. 

That differences in grade and, hence, age 
should be reflected in different reactions to the 
child-adult pattern of life-situations can also be 
explained logically. As was indicated in the pre- 
ceding discussion on grade differences in total 
Word Productivity, this is in keeping with present 
knowledge of the increasing span or breadth of 
TABLE XI


PICTURES DEVIATING ±1 SD AND ±2 SD FROM INDIVIDUAL MEAN
WORD PRODUCTIVITY FOR TOTAL GROUP


[Frequencies by picture are not recoverable. Surviving entries in the chi-square
column: 19.105***, 5.765*, 13.828***, 8.036**, 12.500***, 202, 19.565***.]


For 1 df:   * P(χ² > 3.841) = .05
           ** P(χ² > 6.635) = .01
          *** P(χ² > 10.827) = .001
view that develops with age. In short, first-grade
children have not yet broadened their interests to
include more remote situations, whereas sixth-grade
children no longer find great stimulation in
a consideration of more immediate experiences.
The equating of community and sex in the grade
groups would discount the possibility of differences
appearing in the community and social-status patterns,
a prediction which was borne out by the findings.

The responsiveness of bright children to adult 
or remote experiences and that of less intelligent 
pupils to child or immediate experiences can like- 
wise be interpreted from the standpoint of the re- 
lationship between mental development and the 
growth of interests and concerns. These groups 
being equated in community, sex, and grade variables,
there would be no basis for predicting differences
with respect to the other patterns of life-situations,
and none were found.


Is Word Productivity Related to Picture 
Preference? 


Several hypotheses can be advanced regarding 
the verbal output of children on pictures which 
they like best or like least. One would be that 
children have a great deal to say about what 
pleases them, and very little about situations 
which they dislike. It could be, too, that children
take for granted the desirability of their prefer- 
ences, but find it necessary to substantiate their 
rejections with extended statements. On the other 
hand, it may be that it requires as much verbal 
activity to defend choices as it does to substanti- 
ate rejections. 

Data for the examination of this question were
provided in Part III of the Social Perception Interview.
The subject was asked to designate “the
picture you would like to keep for your very own”
and to tell why. He was given a copy of this picture
and requested to select a second, and then a
third in similar fashion. Finally, he was asked
to point out “the picture you wouldn’t like to
keep,” and to tell why.

Word Productivity for the first, second, third,
and least preferred pictures and for the 10 pictures which were
not mentioned is given for the total group in Table
XVIII, and for the comparative groups in Table
XIX. The skewness of each distribution in the former
table, with scores piled toward the lower end of the scale,
corroborates the finding of the preceding section
that deviant responses are more likely to be in
the direction of greater productivity rather than
less.

Comparisons in mean Word Productivity for
the various designations of likes and dislikes, as
well as for the combined preferences and non-selected
pictures, are given in Table XX. The only
instances of statistically significant differences
occur in the comparison of the Intelligence groups.
The difference between the productivity of the
third choice and least preferred picture is significant
at the 5 percent level, and that between the
second and least preferred is significant at the 1
percent level. In both cases, the
productivity of the rejected picture is higher than
that of the preference pictures.

The evidence indicates that, generally speaking, 
word output in the free-association responses 
(Part I) is no predictor of picture preferences in 
Part III of the Social-Perception Interview. It 
would seem that whatever the causative factors 
operating in Part I, they are not identical to those
influencing picture-selection behavior in Part III.
It may be that telling a story about a situation 
draws upon the individual’s background of experi- 
ence whereas, in the selection of pictures, values 
and interests are brought into focus. The differ- 
ence is between what the subject knows and is fa- 
miliar with (Part I) and what he would like to have 
or hopes to be (Part III). Whether background or 
aspiration is involved, the same number of words 
is required for the expression of one as for the 
other. 

Although this explanation also applies to the 
Low Intelligence group, it does not explain why 
exceptions should occur in the relationship of the
second and third choices to the
least preferred picture. One explanation is that
less intelligent children have more limited goals 
than bright children, and that once having select- 
ed the ‘‘best’’ picture they were less able or less 
motivated to discriminate among the remaining 
pictures for second and third choices. 


Conclusions 


A number of important conclusions may be 
reached regarding the free-association responses 
to the Life-Situation Picture Series. First, Word 
Productivity is an index of behavior. Pupils did 
not respond to the pictures in a random fashion; 
neither were word responses a reflection of cer- 
tain innate pupil abilities or characteristics of the 
instrument. Second, there is a relationship between
children’s social background and their Word
Productivity on specific life-situation pictures or
on patterns of these situations. These differences
were related to intelligence, grade, and sex in descending
order, there being none associated with
the rural and urban populations sampled in this
study. Third, the behaviors elicited in the free-association
and picture-preference phases of the
Social-Perception Interview are different. Whereas
an analysis of Word Productivity in terms of
children’s social background did reveal significant
relationships, this was not the case for the analysis
in terms of picture choices. This is in keeping
with the design for the Social-Perception Interview,
which was planned to bring about increasing
involvement on the part of subjects.
TABLE XVIII


WORD PRODUCTIVITY OF PICTURE PREFERENCES FOR TOTAL GROUP


                  Picture Preference                  Total
                                            Preferred          Non-Selected
Words         1st    2nd    3rd    Least    (1st, 2nd, 3rd)    (N=10)

[Rows for the intervals from 70-79 through 210-219 are not recoverable.]

60-69           6      5      4      2           15                49
50-59           7      9      4      8           20                66
40-49           6     12     12     12           30               100
30-39          12     14     11     19           37               137
20-29          18     15     20     10           53               183
10-19          20     14     15     16           49               169

N              88     ..     ..     87*         263               882
Mean        38.41     ..     ..  41.67        39.72             39.56
SD          29.63     ..     ..  29.08        30.21             29.57
SEM          3.16     ..     ..   3.12

(.. = entry not recoverable)


* Preference not indicated by 1st grade rural girl and by 1st grade urban girl respectively.


TABLE XIX


WORD PRODUCTIVITY OF PICTURE PREFERENCES FOR COMPARATIVE GROUPS


                                    Picture Preference                   Total
Comparison                                                       Preferred    Non-
Group           Statistic     1st     2nd     3rd    Least      (1, 2, 3)    Selected

Community
  Rural         N              44      ..      ..      ..          131         441
                Mean        37.73      ..      ..      ..        37.82       38.38
                SD          34.19      ..      ..      ..        33.21       31.53
                SEM          5.16      ..      ..      ..            *           *

  Urban         N              44      44      44      43          132         441
                Mean        39.09   43.41   44.55   40.58        41.59       40.74
                SD          24.62   24.97   30.57   28.23        26.77       27.41
                SEM          3.71    3.72    4.61    4.30            *           *

Sex
  Boy           N              44      ..      ..      ..          132          ..
                Mean        41.14   40.91   40.23   39.77        40.76       39.48
                SD          32.15   28.47   33.38   28.08        31.05       30.06
                SEM          4.85    4.29    5.03    4.23            *           *

  Girl          N              44      44      43      43          131         442
                Mean        35.68   40.00   40.35   43.61        38.66       39.64
                SD          26.98   32.17   29.31   30.28        29.30       29.07
                SEM          4.07    4.85    4.47    4.62            *           *

Grade
  1st           N              44      ..      ..      ..          131          ..
                Mean        34.77   34.77   38.72   41.05        36.07       36.15
                SD          26.53   27.32   35.32   34.24        29.74       30.06
                SEM          4.00    4.12    5.39    5.22            *           *

  6th           N              44      44      44      44          132         440
                Mean        42.05   46.14   41.82   42.27        43.33       42.98
                SD          32.10   32.15   27.00   23.36        30.23       28.66
                SEM          4.48    4.85    4.07    3.52            *           *

Intelligence
  High**        N              16      ..      ..      ..           48          ..
                Mean        51.25   55.13   55.44   48.56        54.38       51.38
                SD          29.81   38.48   34.52   31.17        33.61       30.50
                SEM          7.45    9.62    8.63    7.79            *           *

  Low**         N              16      16      16      15           48         161
                Mean        35.75   25.38   29.06   35.53        30.42       31.96
                SD          43.01   24.71   26.89   28.82        32.02       33.52
                SEM         10.76    6.18    6.72    7.44            *           *

(.. = entry not recoverable)


* Because of the difference in N between Total Preferred and Total Non-selected pictures, SEM was
not computed separately, but combined in a formula [formula not recoverable].


** N = 16 subjects instead of 44 as for other groups.
In short, Word Productivity provides one
source of validation for the instrument. Gross as
this measure of human behavior may be, it reveals
certain significant relationships which, with few
exceptions, can be explained in terms of what is
known about children. Equally important, an analysis
of word count, with the exceptions noted, has
failed to produce relationships where none could
reasonably be expected.

A number of generalizations also pertain to the 
nature of Word Productivity itself. From the out- 
set, it was established that there is a difference 
between word knowledge, as commonly deter- 
mined by vocabulary tests, and the quantitative 
aspect of verbal behavior in spontaneous oral ex- 
pression. That an individual has a better grasp 
of word meanings is no indicator of how many 
words he will use in expressing himself. Also,
it is clear that broad generalizations about superiority
in language facility are over-simplifications.
No one group of children showed consistently
greater response or lack of response to the
Life-Situation Picture Series. Word Productivity 
depends on what is being talked about. Studies of 
verbal facility will produce different results de- 
pending upon whether subjects are free to select 
topics for discussion or are required to deal with 
prescribed situations. 

Two cautions must be pointed out for the interpretation
of the above findings. First, there is
considerable indeterminacy regarding Word Productivity
when analyzed from the standpoint of deviations
from each individual’s mean word response.
Because of limited sampling groups,
many frequencies were zero or too small to be
treated statistically. To clarify these indeterminate
areas will, quite obviously, require larger
samples.

Second, the fact that none of the analyses revealed
rural-urban differences raises the question of the
applicability of these findings. Although the county
being sampled has only five communities, none of
which is over 15,000 in population, and all the subjects
were farm children, does this constitute a
rural sample? In similar terms, although the
urban community is located in the second largest
urban region in this country, do these children
represent urban children? Could it be that children
living in more isolated rural regions would
differ from those living in a metropolitan community
of several million population? It may be
that to compare rural and urban children in general
is another instance of over-simplification,
and is one explanation for the lack of agreement
found in studies of these culture groups.


FOOTNOTES 


* For assistance with statistical computations, 
the writer wishes to acknowledge his indebted- 
ness to the following research assistants: Irvin 
J. Lehmann, Sobhi T. Geraissa, Donald W.
Hinkkanen, and Mohamed K. Hindy, as well as 
to R. James Evey, Project Supervisor of the 
Numerical Analysis Laboratory, University of 
Wisconsin. 
tion: Methodology,” Journal of Genetic Psy-
chology, XCII (June 1958), pp. 215-46. 
THE RELATIONSHIP BETWEEN HANDWRITING
PRESSURE AND LEGIBILITY OF HANDWRITING
IN CHILDREN AND ADOLESCENTS*


THEODORE L. HARRIS and G. LAWRENCE RARICK 
University of Wisconsin 


Statement of the Problem 


THE PURPOSE of this investigation was to examine
the relationship between handwriting legibility
and point pressure in the handwriting of boys
and girls in grades 4, 6, and 10. Previous re- 
search at the college level had indicated that high 
legibility tends to be accompanied by relatively 
low variability in point force, while low legibility 
tends to be accompanied by relatively high varia- 
bility in point force. This study was specifically 
designed to examine the relationship between these 
two factors at younger age levels. 


The Background of the Investigation 


The present research is one of several studies
which are being conducted at the University of
Wisconsin by a Committee for Research in Handwriting.1
The Committee works as a unit in planning
the research program. In addition, specific
aspects of research are delegated to individual 
members of the group. Aspects of handwriting 
currently being studied by the Committee include 
studies of the characteristics of the handwriting 
product in terms of legibility and its associated 
factors, the characteristics of effective handwrit- 
ing instruments, the factors influencing choice of 
handwriting instruments, motor components in- 
volved in the handwriting process, and perceptual 
factors involved in the production and evaluation 
of handwriting. 

The present research is a direct extension of
an investigation of the relationship between handwriting
pressure and legibility at the college level,
originally reported in 1955 as a bulletin of the
School of Education of the University of Wisconsin,
and subsequently reprinted in the Journal of Experimental
Education.2 This study should be consulted
for a discussion of previous research concerning
handwriting pressure and for a presentation
of the technical details of the instrumentation
used and the type of records obtained.

The experimental equipment employed in the 
present investigation was the same as that used in 
the college study. Essentially, the equipment in- 
cluded a specially constructed writing table with a 
metal platen mounted flush with an adjustable sec- 
tion of the table top (see Figure 1). The platen 
rested on a strain gauge system which made pos- 
sible the detection of minute changes in point 
pressure during writing. The signal from the strain 
gauge system was amplified by a vacuum tube am- 
plifier and photographically recorded by a Hatha- 
way recording oscillograph. 

It became clear at an early stage in the analy- 
sis of the oscillographic records that two or more 
records having the same average word or sentence 
force might differ markedly in the constancy with 
which the force was applied to the page in writing. 
It was felt, further, that if the amount of variation 
in force could be measured in such a way as to 
make comparisons between records of varying 
speeds and levels of average force, a new way 
might be found to explore the relationship between 
handwriting force and legibility. 


* Funds for the support of this study were provided in part by a grant from the Parker Pen Company,
Janesville, Wisconsin, to the University of Wisconsin, and in part by a research grant from the Graduate
School of the University of Wisconsin. Valuable assistance in this research was furnished by
Mary Haberkorn Juaire, Research Assistant, and Henry K. Kaplan, then a graduate student in the Department
of Education.

1. The Committee is composed of the following staff members: Paul W. Eberman, Professor of Education;
Theodore L. Harris, Professor of Education; Virgil E. Herrick, Chairman, Professor of Education;
and G. Lawrence Rarick, Professor of Physical Education.

2. Theodore L. Harris and G. Lawrence Rarick, “The Problem of Pressure in Handwriting,” Journal
of Experimental Education, XXVI (December 1957), pp. 151-78.
A step toward the solution of this problem was
made in the development of the force variation
ratio, the derivation of which is described in the
college study.3 This technique of analysis makes
it possible to measure the cumulative amount of
change in the application of force for each word,
and to derive a ratio between the sum of the deviations
in force from the base line of the oscillographic
record and the time taken to produce the
record. The formula for the force variation ratio
is:


\[
\mathrm{FVR} = \frac{\text{Change in mm.} - 2\,(\text{average height of curve})}{\text{Length of base line (time)}}
\]


This ratio was used as a major technique in the 
analysis of the oscillographic pressure records 
in the college study. 
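
The ratio can be made concrete with a short sketch. It assumes that the oscillographic trace has been read off as a sequence of heights in millimeters, that ‘‘change in mm.’’ means the summed rise and fall of the curve, and that the average height is the mean of the sampled heights. These readings of the formula, and the function and argument names, are illustrative assumptions and not the measuring procedure of the college study.

```python
def force_variation_ratio(heights_mm, baseline_length):
    """Sketch of the force variation ratio for one word.

    heights_mm      -- curve heights above the base line (assumed sampling)
    baseline_length -- length of the base line, i.e. the time axis
    """
    if len(heights_mm) < 2:
        raise ValueError("need at least two sampled heights")
    if baseline_length <= 0:
        raise ValueError("base line length must be positive")

    # Cumulative change: every rise and every fall of the curve is counted.
    total_change = sum(abs(b - a) for a, b in zip(heights_mm, heights_mm[1:]))

    # Average height of the curve (assumed: mean of the sampled heights).
    average_height = sum(heights_mm) / len(heights_mm)

    return (total_change - 2 * average_height) / baseline_length


# Hypothetical trace: rises from the base line to 4 mm and returns.
example = force_variation_ratio([0, 2, 4, 2, 0], baseline_length=4)
```

Under this reading a smooth single rise and fall yields a small ratio, while a trace that fluctuates repeatedly over the same base line yields a larger one.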

Findings of the College Study— Two types of 
findings in the original study will be mentioned 
briefly here, one relating to data gathered con-
cerning force variation ratio, and the other to the
relationship found between this ratio and legibility. 

One of the concerns of the original study was 
to investigate the significance and stability of the 
force variation ratio. Accordingly, a test of fine 
motor coordination was devised; namely, to draw 
four one-inch horizontal lines as steadily as possi- 
ble. The point force was recorded and analyzed. 
The stability of this test was indicated by the rel-
atively high correlation of .88 between the sum of
the force variation ratio for the first and fourth
and for the second and third samples, respective-
ly, for 19 college subjects. The stability of the
pressure pattern in tasks which are similar in na-
ture is shown by the fact that a correlation of .81
was secured between the force variation ratios 
for the short line test and those of the normal 
handwriting samples for the same group of sub- 
jects. This suggests that the force variation ra- 
tio provides a measure of fine motor control in 
handwriting. 
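
The stability check described above pairs the first and fourth line samples against the second and third. A minimal sketch of that pairing follows; it assumes a product-moment coefficient (the text does not name the coefficient used for the .88 figure), and the data values and function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length, n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def split_sums(fvr_by_subject):
    """For each subject's four line samples, return the sums for
    samples 1 + 4 and for samples 2 + 3."""
    outer = [s[0] + s[3] for s in fvr_by_subject]
    inner = [s[1] + s[2] for s in fvr_by_subject]
    return outer, inner


# Hypothetical force variation ratios for three subjects, four lines each.
subjects = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 2.0, 2.0], [3.0, 1.0, 1.0, 3.0]]
outer, inner = split_sums(subjects)
stability = pearson_r(outer, inner)
```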

The relationships among factors studied in the 
college investigation as indicated by correlation 
analysis are reported in Table I. Note that the 
most substantial correlations involve the force 
variation ratio. Average pressure, on the other 
hand, shows little relationship to any other vari- 
able except variation in force. These findings 
markedly influenced the decision to study further 
the relationship between the force variation ratio 
and legibility in a population of children and ado- 
lescents. 


Methodology of the Current Study 


Population and Design— The study was designed 
so that an analysis of variance technique might be 


3. Ibid., p. 162. 


used in the treatment of the data. The subjects
were 144 boys and girls randomly drawn from the
schools of two small urban communities in south-
ern Wisconsin in such a way that 12 boys and 12
girls were secured at each of three grade levels
(grades 4, 6, and 10) in each community. It should
be pointed out that the population was further lim-
ited to those children who were right-handed, who
had had continuous experience in the school sys-
tems concerned, and who were within the normal
range of age expectancy for their grade. Mean
chronological ages of the subjects by sex and grade
are given in Table II. In those cases in which
more than one school in a given community was in-
volved, cases were drawn on a proportional basis
according to the number of children per grade per
school. It was felt that this procedure in the se-
lection of the experimental population would make
possible a more systematic approach to the study
of the relationship between the force variation ra-
tio and legibility as it might be affected by the fac-
tors of sex and community.
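
Drawing cases ‘‘on a proportional basis’’ from several schools can be illustrated as follows. The sketch assumes a largest-remainder rule for turning fractional shares into whole numbers of children; the text does not say how fractions were resolved, and the school names and enrollments are hypothetical.

```python
def proportional_allocation(enrollment, quota):
    """Split a quota of cases among schools in proportion to enrollment.

    enrollment -- mapping of school name to number of eligible children
    quota      -- number of cases wanted for the grade-sex cell (e.g. 12)
    Rounding rule (an assumption): largest remainder.
    """
    total = sum(enrollment.values())
    if total <= 0:
        raise ValueError("no eligible children")

    exact = {school: quota * n / total for school, n in enrollment.items()}
    allocation = {school: int(share) for school, share in exact.items()}

    # Hand any cases left over to the schools with the largest fractions.
    leftover = quota - sum(allocation.values())
    ranked = sorted(exact, key=lambda s: exact[s] - allocation[s], reverse=True)
    for school in ranked[:leftover]:
        allocation[school] += 1
    return allocation


# Hypothetical community with two schools enrolling 60 and 40 fourth grade boys.
cases = proportional_allocation({"School A": 60, "School B": 40}, quota=12)
```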

The Experimental Task—The basic data for 
this study were gathered during the fall and winter 
of the 1955-56 academic year. Since a relatively 
large population of children was included in the 
study it seemed best to move the experimental 
equipment to the two communities in which the 
data were to be gathered. The cooperating schools 
provided a room for the experimental equipment 
and also aided in securing an experienced teacher 
to assist in helping the children adjust to the ex- 
perimental situation. 

Each child was tested individually, all meas- 
ures being secured on a particular child during 
one testing period. Upon entering the testing room 
the child was seated at the handwriting table and 
necessary adjustments in chair height and writing 
angle were made. Each child was asked to write 
his name on a card. The child was then shown a
card with the typewritten sentence, ‘‘The quick 
brown fox jumps over the lazy dog.’’ He was asked 
to study the sentence until he could write it from 
memory. Each pupil was given an opportunity to 
write the standard sentence several times until he 
learned to spell each word and to reproduce the en-
tire sentence correctly. Next, a fresh card was
placed over the platen in preparation for recording
point pressure in writing the standard sentence. 

After determining that the pupil could repeat the 
standard sentence from memory, he was then 
asked to write the sentence on the card. Care was
taken to instruct each child to write in his usual 
way. At this time his handwriting pressure was 
recorded. After completing the reproduction of 
the standard handwriting sample, a second card 
was placed over the platen (see Figure 2). The 
subject was asked to complete the figure with
either curved or straight lines as indicated on the 




JOURNAL OF EXPERIMENTAL EDUCATION 


TABLE I


RANK-ORDER CORRELATIONS BETWEEN LEGIBILITY, TIME,
AVERAGE PRESSURE, AND VARIATION IN FORCE FOR
57 NORMAL, SLOW, AND FAST ADULT SAMPLES


Variables                                  Rho

Variation in force(a) and pressure(b)      .69
Variation in force and time(c)             .67
Variation in force and legibility(d)       .59
Legibility and time                        .54
Pressure and time
Pressure and legibility                   -.01


(a) Ranked from low to high
(b) Ranked from low to high
(c) Ranked from slowest to fastest
(d) Ranked from best to poorest


TABLE II


MEAN CHRONOLOGICAL AGES OF SUBJECTS BY GRADE AND SEX


               Boys              Girls
               Mean              Mean
Grade          (Months)          (Months)

Grade 4        116.38            115.75
Grade 6        143.88            143.50
Grade 10       185.13            186.08
FIGURE 2: FORMS TO BE COMPLETED IN THE TEST OF FINE MOTOR CONTROL


card. This test was designed to measure the pu- 
pil’s finer motor control in a situation similar to 
that of handwriting. 

Methods of Analysis of Data—In the present
study, force variation ratios were calculated by
the techniques used in the college study. How-
ever, the calculation of the force variation ratio
for an entire sentence is a time-consuming, la-
borious job. Accordingly, a study was made of
the consistency of the force variation ratio for
adults and for a randomized sample of subjects in
grades 4, 6, and 10 by correlating the force vari-
ation ratios for each word with the average force
variation ratio for the entire sentence. These da-
ta are presented in Table III. The consistently
high correlation between the force variation ratios
for individual words and the average force varia-
tion ratio for the entire sentence may be noted.
Since the force variation ratio for the words ‘‘fox’’
and ‘‘jumps’’ showed consistently the highest cor-
relations, the sum of the force variation ratios
for the phrase ‘‘fox jumps’’ only was used. In
other words, the children wrote the complete sen-
tence but an analysis of the pressure records of
these two words only was made. As might be in-
ferred from the data in Table III, the time re-
quired to write the phrase ‘‘fox jumps’’ correlated
highly with the time required to write the en-
tire sentence. Table IV demonstrates that the
time for writing a short phrase is representative
of the tempo of writing for the entire sentence.

Evaluations of the legibility of the handwriting 
samples produced by children and adolescents in 
this study were made by rating the legibility of the 
total sentence on an eleven-point experimental
scale composed of reproductions of the standard 
sentence, ‘‘The quick brown fox jumps over the 
lazy dog.’’ This experimental scale was a com- 
posite scale made up of samples drawn from pre- 
viously developed handwriting legibility scales 
prepared for grades 4, 6, and 10, respectively. 
The legibility ratings were done by five exper-
ienced judges who followed standard procedures
in determining each child’s legibility rating. Each
rater displayed a relatively high degree of consis-
tency in his own ratings. This is shown in Table V.

A generally high level of consistency among 
judges was also established. Median scale values 
for the ratings of the five judges were accepted as 
the best estimate of the legibility of the sample. In 
122 cases out of 144 samples, the following cri-
teria were met: 1) four or five of the ratings were
identical with or contiguous to the median value
(in 116 cases), and 2) a central tendency was
clearly indicated (in six cases). The remaining
22 samples representing split distributions were
then considered individually. For example, if one
sample varied markedly from the majority of the
ratings, it was re-rated. If the new rating did not
move toward the median value, the atypical rating
was discarded and the most typical retained.
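
The first criterion can be expressed as a small check. The sketch assumes integer scale steps from 1 to 11 and reads ‘‘contiguous’’ as lying within one scale step of the median; the second criterion (a clearly indicated central tendency) is a matter of judgment and is not modeled. The ratings shown are hypothetical.

```python
from statistics import median


def summarize_ratings(ratings):
    """Return the median scale value of the judges' ratings and whether
    at least four ratings are identical with or contiguous to it."""
    mid = median(ratings)
    # "Contiguous" is taken here to mean within one scale step (assumption).
    close = sum(1 for r in ratings if abs(r - mid) <= 1)
    return mid, close >= 4


# Hypothetical ratings of one sample by the five judges.
value, meets_first_criterion = summarize_ratings([5, 5, 6, 4, 9])
```

A sample failing the check would be among those set aside for individual consideration.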

Comparisons with the College Study—A sum- 
mary of certain aspects of the methodology of the 
handwriting study of children and adolescents and 
that of the college study is given in Table VI. 

It will be noted that the current study does not 
constitute a direct replication of the college study. 
In addition to obvious differences in the size and 
level of the population, other important specific 
differences should be noted: 


1. The current study represents a randomized
sample within purposely controlled factors
of community, grade, and sex, while the col-
lege sample was stratified to identify the
best and poorest writers in a given college
population.


2. A single sample of handwriting was utilized
in the current study, whereas subjects wrote
a normal, a slow, and a fast sample in the
college study.


3. In the current study the handwriting samples
were rated on an 11-point scale, while in the
TABLE III


CORRELATIONS OF THE AVERAGE FORCE VARIATION RATIO FOR THE EN-
TIRE SENTENCE WITH THE FORCE VARIATION RATIOS OF INDIVIDUAL
WORDS FOR 30 SUBJECTS RANDOMLY DRAWN FROM GRADES 4, 6,
AND 10 AND FOR 19 ADULT COLLEGE SUBJECTS


4th Grade 6th Grade 10th Grade
Word (N=10) (N=10) (N=10)


quick .856 .818 .723
brown .809 .824 .847
fox .924 .884 .918
jumps .908 .897 .944
over .883 .863 .914
the .780 .719 .957
lazy .819 .827 .800


dog .804 .927


‘‘fox jumps’’ .959 .856 .946


TABLE IV 


CORRELATIONS BETWEEN WRITING TIME FOR THE PHRASE ‘‘FOX JUMPS’’
AND THE TOTAL SENTENCE WRITING TIME FOR 30 SUBJECTS RAN- 
DOMLY DRAWN FROM GRADES 4, 6, AND 10 


N 


10 


10 


10 


+4 : 
(N=19) 
. 945 = 
940 
. 890 
966 
Grade 
4 
TABLE V


CONSISTENCY CHECKS ON HANDWRITING LEGIBILITY RATINGS OF
FOUR JUDGES* BASED ON 12 SAMPLES RANDOMLY DRAWN
FROM A TOTAL OF 144 SAMPLES


Deviation in Scale Steps from Original Rating 


None 1 2 3 


Rater 1 
Rater 2 
Rater 3 


Rater 4 


*The 5th judge was unavailable when the consistency check was made. 


TABLE VI


A COMPARISON OF THE METHODOLOGY USED IN THE HANDWRITING STUDY OF CHILDREN
AND ADOLESCENTS AND THAT OF THE COLLEGE STUDY


                      Current Study of Children       Previous Study of
                      and Adolescents                 College Students

Description of        144 subjects in grades          19 subjects in junior
Sample                4, 6, and 10                    year of college

                      72 boys and 72 girls            9 men and 10 women

                      A randomized sample of          A non-random sample
                      12 boys and 12 girls at         of the best and poorest
                      each of three grade levels      writers in a university
                      in two communities              class of 231 students

Number and type       One sample (normal)             Three samples (normal,
of handwriting        per subject                     slow, and fast) per
samples                                               subject

Method of             Rating of normal samples on     Single rank-ordering of
rating                an 11-point composite scale.    57 samples (three samples
legibility            The scale was made up of        per individual) of
                      samples drawn from Grades       normal, slow, and fast
                      4, 6, and 10.                   writing
college study samples were placed in rank-
order of legibility. In both cases a satis-
factory degree of consistency was obtained.


Differences in the methodology of the two stud-
ies produced certain differences in the data and
likewise placed certain limitations upon compari-
sons of the data in the two studies. It seems
clear that the random sampling technique in the
current study tended to maximize the chances of
a more normal distribution of handwriting legi-
bility than was obtained in the college study. The
use of a single sample of handwriting produced at
a normal rate in the current study probably result-
ed in less variance in legibility than would have
occurred had the samples been written at differ-
ent speeds as in the college study.

Since the studies likewise differed markedly in
the method of securing ratings of legibility, one
using a handwriting scale and the other a rank-
ordering technique, the college samples were lat-
er re-evaluated for legibility upon a scale and the
relationships between legibility and the force var-
iation ratio re-examined. These findings are re-
ported in the next section of this paper, as well as
the detailed findings of the relationship between
handwriting pressure and legibility of handwriting
in children and adolescents.


Results of the Study 


While the major emphasis of the present in- 
vestigation was upon the determination of the re- 
lationship between the two factors, variation in 
point pressure and handwriting legibility, data 
were also obtained on writing rate and point pres- 
sure for boys and for girls at each of the three 
grade levels. In interpreting the results of the 
present investigation, reference will also be made 
to the findings of the previous college study, and 
to a follow-up study on a small sample of elemen-
tary school children in which the rate of writing
was systematically varied. 

Initial Steps in Treating the Data—It will be 
recalled that the subjects for the present investi- 
gation were randomly drawn from grades 4, 6, 
and 10 in the schools of two small urban commun-
ities. Data on handwriting legibility and varia-
bility in point pressure constituted the primary
sources of data on the 144 children in these two 
communities. An analysis of variance design was 
employed to test the hypotheses that no true grade, 
sex, or community differences existed for either 
legibility or variability in point pressure. 

The legibility variable was selected as the ini-
tial item for analysis. In selecting a level of con-
fidence which seemed appropriate to the data and


to the measuring techniques employed, the inves- 
tigators set the one percent level. The results of 
the analysis of variance are given in Table VII. It 
will be noted that the F ratios of 43.18 and 19.08 
for sex and grade respectively are significant be- 
yond the .01 level, whereas no community differ- 
ences in legibility were obtained. Hence, the null 
hypotheses were rejected for sex and grade, and 
the legibility data for the two communities were 
combined to determine what grade and sex differ- 
ences in legibility did exist. 
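
The F ratios cited from Table VII can be checked by dividing each mean square by the within-groups mean square. The short sketch below uses only values printed in that table; the variable and function names are illustrative.

```python
# Mean squares as printed in Table VII (handwriting legibility).
MEAN_SQUARE_WITHIN = 2.6761
mean_squares = {
    "Sex": 115.5624,
    "Grade": 51.0486,
    "Sex x Grade": 31.6457,
}


def f_ratio(mean_square_effect, mean_square_within):
    """F ratio for an effect tested against the within-groups mean square."""
    if mean_square_within <= 0:
        raise ValueError("within-groups mean square must be positive")
    return mean_square_effect / mean_square_within


f_ratios = {name: f_ratio(ms, MEAN_SQUARE_WITHIN) for name, ms in mean_squares.items()}
for name, value in f_ratios.items():
    print(f"{name}: F = {value:.4f}")
```

The quotients agree with the tabled values of 43.1831, 19.0757, and 11.8253 to four decimal places.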

Sex and Grade Differences in Handwriting Leg-
ibility—Grade and sex trends in legibility includ-
ing those for the college sample are shown in Fig-
ure 3. It should be noted that the slope of the leg-
ibility curve for the girls shows substantial im-
provement in legibility from grade 4 through grade
10, whereas such is not the case for the boys. In fact
neither the 10th grade boys nor the sample of col-
lege males achieved the mean legibility standard
of the 6th grade girls.

Means and standard deviations for handwriting 
legibility are given in Table VIII based on the 11- 
point scale. While the means of the girls are su- 
perior to those of the boys at each grade level, the 
standard deviations are sufficiently large to as- 
sure considerable overlapping of the distributions 
at grades 4 and 6. In order to determine whether 
the observed differences in means by grade and 
sex were true or chance differences, the Duncan 
Range Test4 was applied to the data for grades 
4 through 10. 

The results of this test showed that there were 
no significant differences in the mean legibility 
ratings between boys of different grades. Nor was 
the average legibility rating of fourth grade girls 
significantly different from the mean legibility rat- 
ings of boys in grades 4, 6, and 10. However, the 
test disclosed that the mean legibility of the sixth 
grade girls and the tenth grade girls differed sig- 
nificantly (p = .01) from each other and from each 
and all other mean legibility ratings for both sexes. 
In other words, the handwriting legibility of the 
tenth grade girls was superior to that of the sixth 
grade girls, and both the sixth and tenth grade 
girls were superior in legibility to the boys and 
girls at all other grade levels. 

Differences in Force Variation Ratios for Sex,
Grade, and Community— The data on variability in 
point pressure were analyzed in the same way as 
those for legibility. An analysis of variance was 
used to test the hypothesis that there were no true 
grade, sex, or community differences in varia- 
bility in the application of point pressure. Table 
IX gives a summary of this analysis. As will be 
noted, a difference significant at the one percent 
level was found for grade only. However, sex and 


4. David B. Duncan. ‘‘Multiple Range and Multiple F Tests,’’ Biometrics, XI (March 1955), pp. 1-9. 
FIGURE 3: MEAN LEGIBILITY RATING BY SEX AND GRADE
TABLE VII


SUMMARY OF ANALYSIS OF VARIANCE FOR HANDWRITING LEGIBILITY


Source of          Sum of                          Mean
Variation          Squares      df                 Square       F

Sex                115.5624     1                  115.5624     43.1831*
Community             .1736     1                     .1736
Grade              102.0972     2                   51.0486     19.0757*
S x C                1.5625     1                    1.5625
S x G               63.2914     2                   31.6457     11.8253*
C x G                3.3472     2                    1.6736
S x C x G            5.5418     2                    2.7709
Within Groups      353.2503     (12)(11)=132         2.6761

Total              644.8264     143


* Significant beyond the 1 percent level


TABLE VIII


MEANS AND STANDARD DEVIATIONS OF HANDWRITING LEGIBILITY
SCORES BY GRADE AND SEX


Grade
Level      Sex       Mean      SD

4          Boys      7.17      .90
           Girls     6.83      1.73
6          Boys      6.71      1.46
           Girls     5.21      2.10
10         Boys
           Girls     3.17      1.86
15         Men       6.22      2
           Women     3.10      1.92
TABLE IX


SUMMARY OF ANALYSIS OF VARIANCE FOR FORCE VARIATION RATIO DATA


Source of          Sum of                          Mean
Variation          Squares      df                 Square       F

Sex                 23.2900     1                   23.2900      4.6939*
Community           23.3900     1                   23.3900      4.7140*
Grade              147.0966     2                   73.5483     14.8229**
S x C               14.8225     1                   14.8225      2.9873
S x G                2.6689     2                    1.3345
C x G                8.6123     2                    4.3062       .8679
S x C x G            3.9631     2                    1.9816
Within Groups      655.9510     (12)(11)=132         4.9618

Total              878.7944     143


* Significant at 5 percent level
**Significant at 1 percent level


TABLE X 


MEANS OF FORCE VARIATION RATIO BY GRADE AND SEX 


Grade FVR

Level Sex Mean SD
4 Boys 6.32 1.62
Girls 6.84 2.20
6 Boys 6.49 1.52
Girls 7.44 2.20
10 Boys 8.47 2.21
Girls 9.23 2.85
15 Men 12.56 4.07
Women 2.93
community differences were significant at the five
percent level. So that the analysis of the data
might parallel that for the legibility data, the re-
sults for the two communities were combined in
further examining differences in sex and grade.

The mean force variation ratios by grade and
sex are presented in Figure 4. It will be noted
that with advancing grade level, the force varia-
tion ratio increases for both sexes through grade
10. The upward trend continues for college men
but not for college women.

Group data for the force variation ratios are 
given in Table X. Although the college scores 
were not included in the analysis of variance, mean
scores for these subjects are also presented. The
Duncan Range Test was applied to the force vari- 
ation ratio data for grades 4 through 10. Signifi- 
cant differences were found between tenth grade 
girls and fourth and sixth grade boys and girls 
(p= .01), and between tenth grade boys and fourth 
and sixth grade boys (p= .01) and fourth grade 
girls (p= .05). While these grade differences are 
noteworthy, no true sex differences were obtained. 

If Figures 3 and 4 are compared, it may be
seen that the rise in mean force variation ratio with
advancing grade level is paralleled by an increase
in handwriting legibility. Hence the relationship
noted in the college study that highly legible hand-
writing is associated with a low force variation
ratio is not supported by the data obtained on these
children. In other words, with advancing age and
improved handwriting legibility, one would have
expected a decline rather than an increase in the
force variation ratio. The data show, further- 
more, that girls have a slightly higher mean force 
variation ratio at each succeeding grade level 
through grade ten than the boys while producing 
the more legible handwriting. At the college lev- 
el, however, this trend is reversed, the women 
producing highly legible handwriting with low var- 
iability in point pressure. The men, on the other 
hand, continue the upward trend in variability in 
point pressure with a negligible increase in legi- 
bility. The reason for the contradictory findings 
of the two studies will become evident in a later 
section of this report. 

Grade and Sex Trends in Writing Time—It is 
a common observation that young children write 
more slowly than older children, although there 
is considerable variability in writing rates among 
individuals of two given ages. The mean time for 
writing the standard sentence is plotted by grade 
and sex in Figure 5. It will be noted that the sex 
differences in rate of writing were not great al- 
though the mean writing time was less at succes- 
sive grade levels. In fact the rate of writing 
more than doubled between grades 4 and 10. The 
most rapid gains in writing speed occurred be-
tween grades 4 and 6, with slightly less exagger-
ated gains between grades 6 and 10 and only a
relatively slight increase in writing speed be-
tween grades 10 and 15.

Mean values for writing time by grade and sex
are shown in Table XI. These data show that the
mean writing time declined at a relatively constant
rate with advancing grade level for both sexes. In
general, the standard deviations indicate wide var-
iability in writing times except for males in grade
10 and in the college sample. The Duncan Range
Test was applied to the data for mean writing time
for grades 4, 6, and 10. No significant sex differ-
ences within each of grades 4, 6, and 10 were found
for mean writing time. However, the perform-
ance of fourth grade boys was significantly differ-
ent from that of sixth grade boys (p = .05), sixth
grade girls (p = .01), and tenth grade boys and
girls (p = .01). The performance of fourth grade
girls was also significantly different from that of
sixth grade girls (p = .05), and from that of tenth
grade boys and girls (p = .01). The mean writing
time of the sixth grade boys was also significantly
different from that of tenth grade boys (p = .05).
These data clearly indicate that a substantial in-
crease in speed of handwriting may be expected at
successive grade levels, and that speed in hand-
writing appears to be a function of age rather than
of sex.

Point Pressure Changes with Advancing Grade
Level—During the initial stages of writing there
is a tendency for children to exert considerable
point pressure. However, as children gain exper-
ience in writing, general observation would lead
one to believe that point pressure declines with
age. To examine this observation, point pressure
data were obtained from the oscillographic pres- 
sure records and are presented in Figure 6. The 
average force per second exerted by the point on 
the platen for the words ‘‘fox jumps’’ constituted 
the unit of measurement. It will be noted that with 
advancing grade level, average word force tended 
to decline through grade 10. However, sex differ- 
ences in point pressure in grades 4 through 10 
were slight. The college data for point pressure 
are not directly comparable and are therefore 
omitted. 

Table XII shows that the variability among sub- 
jects in point pressure remained high at grades 4 
and 6 even though the average writing pressure 
tended to decline. Only at the tenth grade were
the standard deviations for writing pressure suffi-
ciently low to indicate some uniformity among sub-
jects in the application of point pressure. When
the Duncan Range Test was applied to the data for
point pressure, no significant sex differences
were found within grades in mean point pressure.
Significant differences (p = .05) were found for both
sexes between grades 4 and 10 only. Thus it would
appear that successive grade changes for both
sexes in mean point pressure are not as great as
those in speed of writing.

Correlation Analysis—In order to determine 
the relationship between the force variation ratio 
TABLE XI


AVERAGE TIME REQUIRED TO PRODUCE THE STANDARD SENTENCE
BY GRADE AND SEX


Grade                         Mean Time
Level       Sex       N       (in seconds)      SD

4           Boys      24      119.53            39.91
            Girls     24      102.62            40.06
6           Boys      24       79.92            28.97
            Girls     24       65.64            14.74
10          Boys      24       42.02             6.81
            Girls              46.85
College     Men        9       32.12             2.18
            Women              39.47


TABLE XII


MEANS FOR POINT PRESSURE DATA BY GRADE AND SEX


Grade                         Mean Pressure
Level       Sex       N       (Av. height in mm.)      SD

4           Boys      24      43.72                    26.16
            Girls     24      41.69                    20.96
6           Boys      24      30.80                    10.61
            Girls     24      29.89                    20.25
10          Boys      24      24.46                     7.94
            Girls     24      24.88                     7.29


and legibility, Pearson product-moment correla-
tions were run between these two variables at the
4th, 6th, and 10th grade levels. These correla-
tions are given in Table XIII. In order to inter-
pret the apparent discrepancy between the data
for the college population and the data for the pres-
ent study, the correlations between the force var-
iation ratio and legibility of the college population
are also included.

It will be noted that all correlations between 
the force variation ratio and legibility ratings of 
the scaled samples for the 4th, 6th, and 10th 
grade pupils are low. It is evident, therefore, 
that under the conditions of this investigation, leg- 
ibility and the force variation ratio appear to be 
unrelated. On the other hand, all correlations 
for the college subjects are positive and at least 
four are moderately high. Thus, when all 57 sam-
ples (slow, normal, and fast) for the 19 college
subjects were correlated with the force variation 
ratio, moderately high correlations were obtained. 
The inclusion of the three samples at different 
speeds for each subject tended to broaden the 
range of legibility as well as that of the force var- 
iation ratio and hence tended to maximize the de- 
gree of correlation. On the other hand, when the 
analysis was restricted to only 19 normal sam- 
ples, the effect of writing speed was minimized. 
Hence, the range of scores for legibility and for 
the force variation ratio was markedly reduced. 
This may, in part, explain the low correlations 
between these variables. It should also be noted 
that in those instances in which the samples were 
rated against the handwriting scales, lower coef- 
ficients of correlation were obtained than when the 
samples were ranked from most legible to least 
legible. In the present study, handwriting scales 
were used for all legibility rankings. It is be- 
lieved that the use of scales tended to group the 
scores near the center of the distribution and 
yielded lower correlations than would have been 
found had the samples been placed in a single rank- 
order. This is illustrated by the low correlation 
of .18 found between the force variation ratio and 
legibility when the normal samples only for the 
college population were rated for legibility on the 
composite scale. This correlation for normal 
handwriting samples is comparable to that ob-
tained for the children and adolescents. It is pos-
sible that had samples of normal, fast, and slow
writing for children and adolescents been gathered 
and placed in rank order of legibility, correlations 
of a magnitude similar to those obtained for the 
mixed writing samples of the college population 
might have resulted. 

Correlations Between Motor Control and Legi-
bility—It will be recalled that upon completion of
the standard sentence each pupil was given a bat- 
tery of eight motor control tests (see Figure 2) 
which involved movements of the hand and fingers 
similar to but not identical to those used in hand-
writing. Pressure records were secured on all
motor control items. Intercorrelations among the 
samples of force variation ratios for the eight 
items were consistently high, ranging from .609
to .948. The two items which correlated most 
highly with each other and with all others were the 
motor control items (0 and (0. As may be seen in 
Table XIV, the correlations between the force var- 
iation ratios (FVR’s) for these two items ranged 
from .705 to .719. Since these two test items in- 
volve rather basic movements in letter formation 
and since the resulting FVR’s are moderately 
highly correlated, it was decided to use the sum
of the FVR’s as the measure of motor control. 

In order to examine further the stability of the 
force variation ratios, the variability in point 
pressure of the words ‘‘fox jumps’’ was correlat- 
ed with the sum of the FVR’s of the two motor con- 
trol items. These correlations range from . 353 
to .539. It is interesting to note that while these 
correlations are not high, they are all positive and 
two of the three are significant at the 1 percent 
level and the other is significant at the 5 percent 
level. 
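
One common way of judging such coefficients is to convert each r to a t statistic with N - 2 degrees of freedom and compare it with tabled critical values. The sketch below assumes this test; the text does not state which test the investigators used. The coefficients and sample sizes are those printed in Table XIV, and the function name is illustrative.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing a product-moment correlation against zero,
    with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least three pairs")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt((n - 2) / (1 - r * r))


# Correlations of the motor control sum with "fox jumps", from Table XIV.
reported = {"4th Grade": (0.409, 48), "6th Grade": (0.539, 46), "10th Grade": (0.353, 47)}
for grade, (r, n) in reported.items():
    print(f"{grade}: r = {r:.3f}, N = {n}, t = {t_from_r(r, n):.2f}, df = {n - 2}")
```

The smallest coefficient yields the smallest t, in keeping with its being reported at the 5 percent rather than the 1 percent level.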

The consistently positive correlations among 
these data suggest that these may be indicative of 
a definite motor function common to both tasks. 
The fact that these combinations are not as high 
as the correlation of .81 attained in a similar set 
of tasks in the college study may mean that a sta- 
bilized motor pattern for handwriting is not 
achieved until college age. 

The correlations between legibility and motor 
control shown in Table XIV are negligible. These
findings are in agreement with the correlations 
reported in Table XIII between legibility and the 
force variation ratio. 

Legibility and Force Variation Ratio as a Func-
tion of Writing Speed—In order to study the effects
which variations in the individual’s rate of writing
might have upon the relationship between legibility 
and variability in point pressure, data were ob- 
tained on another sample of 20 fourth grade chil- 
dren. These children wrote the standard sample 
at their usual rate of writing, at a slower rate than
normal, and at a rate which was substantially faster
than normal. The results obtained for these 20 
children are shown in Table XV. In arriving at 
the figures included in Table XV, an index value 
of 1.00 was taken as the average legibility and av- 
erage force variation values at the normal rate of 
writing. Note that the values listed under the head- 
ings Slow and Fast are percentages of the mean 
values for the normal writing rate. For example, 
an index value of less than 1.00 indicates legibili- 
ty better than that produced at normal writing 
rates, and a value higher than 1.00 is indicative 
of legibility poorer than that achieved at normal 
writing rates. Conversely, a variability index of 
less than 1.00 indicates relatively low force vari- 
ability, whereas an index greater than 1.00 is 

TABLE XIV

INTERCORRELATIONS OF MOTOR CONTROL DATA AND CORRELATIONS OF
LEGIBILITY RATINGS WITH MOTOR CONTROL

                              4th Grade   6th Grade   10th Grade
                              N = 48      N = 46      N = 47

FVR of (0 with FVR of (0      .705* .707*
= (0 t0 with Fox Jumps        .409*       .539*       .353**
(0 with Legibility Rating     .109 .009

 * Significant at the 1 percent level
** Significant at the 5 percent level


TABLE XV

LEGIBILITY AND FORCE VARIATION RATIO AS FUNCTIONS OF WRITING SPEED AT THE FOURTH
GRADE LEVEL, USING THE AVERAGE FORCE VARIATION RATIOS OF THE NORMAL
SAMPLES AS THE REFERENT

                                 Slow                  Normal   Fast
                                 (Percent of Normal)            (Percent of Normal)

Legibility             Adult                                    1.55
                       Child     .81   high            1.00     1.29   low

Variability in Force   Adult     .80                            1.52
                       Child     .94   low             1.00     1.62   high

evidence of higher force variation ratio than at
the normal writing rate. Data from the college 
population were handled in the same way, using 
the averages of the samples produced at normal 
writing rates as the basis for computing indices 
for legibility and variability at slow and fast writ- 
ing rates. It is interesting to note that complete 
agreement was obtained between the child 
and adult samples for both legibility and variability
when the subject moved from the normal
rate of writing to either the fast or the slow
rate. 
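
The index described above is simply the mean score at a given writing rate divided by the mean score at the normal rate. A minimal sketch follows; the function name and the raw means used in the example are hypothetical and are chosen only to illustrate the arithmetic, not taken from the study's data.

```python
def rate_index(mean_at_rate: float, mean_at_normal: float) -> float:
    """Express a mean score as a proportion of the mean at the normal rate.

    By construction the normal writing rate has an index of 1.00.
    """
    if mean_at_normal <= 0:
        raise ValueError("the normal-rate mean must be positive")
    return mean_at_rate / mean_at_normal


# Hypothetical raw means (illustration only).
normal_mean = 4.0
slow_mean = 3.24

print(round(rate_index(slow_mean, normal_mean), 2))
print(round(rate_index(normal_mean, normal_mean), 2))
```

Because a lower legibility index indicates better legibility in this scheme, an index below 1.00 at the slow rate corresponds to the ‘‘high’’ legibility entry in Table XV.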


It is evident that when the legibility and the 
force variation ratio scores were examined as a 
percentage of the scores obtained at normal writing
rates, high legibility was obtained on the average
at slow writing rates and low legibility occurred
when the writing rates were rapid. Likewise,
when the writing rate was slow, variability
in force for both groups of subjects was low and 
legibility was high; when the speed of writing was 
fast, variability in force was high and legibility 
was low. This lends support to the earlier observation
about the college population that rapid writing
tends to produce poor legibility and high variability
in the application of force, whereas slow
handwriting is associated with high legibility and 
low variability in application of force. It would 
appear, therefore, that the force variation ratio 
as such does not discriminate between the poor 
handwriter and the highly legible one at normal 
writing rates, but the evidence suggests that as a 
person moves away from his normal rate of writ- 
ing, legibility and variability in force are re- 
lated. 


Conclusions 


This study has tested the hypothesis that legi- 
bility and force variation ratio are not signif- 
icantly related in a child and adolescent popula- 
tion. The evidence presented in this study sup- 
ported this hypothesis under conditions in which 
the children were asked to write at their usual 
rate of writing. While sex and grade differences 
in legibility were evident, these differences could 
not be attributed to differences in variability in 
point pressure. However, the data showed that 
when either children or adults moved away from 
their normal writing tempo, high legibility tended 
to be associated with low variability in application 
of force and poor legibility was associated with 
high variability in point pressure. In this sense, 
the original hypothesis was not supported for 
either children or adults. 

The findings suggest that each individual has 
his own pattern for speed of writing and for vari- 
ability in the application of force. These might 
be considered to be rather basic components of a 
motor set in handwriting. If speed of handwriting 
is increased, variability in application of force is 
likewise increased, the motor set is disturbed and 
handwriting legibility is adversely affected. 

The Committee in Handwriting is now broaden- 
ing the scope of its investigation to include other 
motor and perceptual aspects of the handwriting 
process. It is felt that the present study has pro- 
vided additional evidence of the importance of fo- 
cusing attention upon the individual and the stability 
of his performance under varying conditions. This
would likewise appear to be a worthwhile focus for
studying learning problems in other basic skills. 


GROUP PROBLEM SOLVING SKILLS IN
ELEMENTARY SCHOOL CHILDREN


Introduction


OVER THE past decade the development of new 
techniques of measurement in the general area of 
group dynamics has lagged behind the ever-increasing
emphasis on research in this field. Although
a few investigators have devised special
methods for measuring some of the important
elements of group interaction (1, 5), most researchers
have, like Lippitt and White, invented
their own unique ‘‘test situations’’ and then employed
the more familiar method of verbatim recording
followed by detailed analysis (4). This
paper describes an attempt which has been made 
to devise a ‘‘standard situation’’ in which defined 
elements of group interaction can be systemati- 
cally observed, recorded, and scored. 

The test is an outgrowth of the work done by 
the Mid-Century Committee on Outcomes in Ele- 
mentary Education (2), and as such is geared to 
the measurement of social relations skills among 
elementary school children. It is believed, however,
that the technique possesses wide applicability
in areas other than the one for which it was
originally devised, since it has been used successfully
with teen-age as well as with adult
groups in a variety of special fields. But, as described
in the following pages, the test is presented
as a possible research tool which might
provide data to answer, at least in part, many of
the important methodological questions which
present-day educational techniques and practices 
raise. 


The Test and its Administration 


The Russell Sage Social Relations Test is a 
situational test created to assess the nature and
quality of two important aspects of elementary
school children’s skill in social relations: (1) skill
in cooperative group planning procedures, and (2) 
skill in techniques of cooperative group action. It 
is designed for children in grades three through 
six and occupies approximately one hour of time. 
Its administration requires a trained examiner 
and a trained observer.

The test consists of three construction-type 
problems, graded in difficulty and administered 
one after the other. For each problem the chil- 
dren are provided with thirty-six interlocking con- 
struction blocks of various shapes and colors and 
a model which they are to copy exactly. All thirty-six
blocks are necessary for the construction of
each model, there being none left over when the 
problem is completed. As shown in Figure 1, the 
first problem requires the children to build the 
figure of a house; the second, a simple footbridge; 
and the third, the figure of a dog. 

The test is administered to an entire classroom
group of children at once. In the course of
the administration, the examiner tells the children
that the purpose of the test is to see how well they
can work together. He tells them that the test does
not provide scores for individual pupils, but only
one score for the class as a whole. Pupils may,
therefore, help each other, discuss things freely,
and work cooperatively on the problem. He provides
each child in the room with one or two of the
blocks needed to build the particular model shown 
them. He then tells the group that, although they 
can have all the time they need to plan how they 
are going to go about it, they will be allowed only 
fifteen minutes for the actual construction. Finally,
the examiner informs the group that their
score on the test depends upon the time it takes


* The development of this instrument was made possible by a grant from the Russell Sage Foundation. 


**The writer is indebted to Dr. William E. Coffman and Dr. William A. Jenkins for their fundamental 
contributions to the conceptual and theoretical framework of the test, as well as to its overall design. 


FIGURE 1

THE TEST PROBLEMS

Problem 1: The house. Problem 2: The footbridge. Problem 3: The dog. (Block colors: red, blue, white.)

them to build the model: the shorter the time, the
higher their score.

Having explained the test and its rules to the 
class, the examiner allows the pupils to proceed 
with the task of developing a plan of action. When
this plan is completed and all members of the
group have it clearly in mind, he tells them to begin
construction of the first model and immediately
starts timing. He stops his watch when the final
block is correctly placed and announces the
score to the class. He then proceeds with the 
second and more difficult problem. The class 
is again given time to plan and organize itself as 
in the first problem. The third and most difficult 
problem is administered in like manner. 

As mentioned above, the test is designed to 
provide information about two different aspects of 
social relations skill. The first is the ability to 
participate in group discussions of plans for future
group action; the second is the ability to regulate
one’s behavior in accordance with these
plans so that movement toward the agreed-upon 
group goal is facilitated. The test is thus divid- 
ed into two separate parts. The first part is 
termed the Planning Stage—the period during 
which the children devise a plan of action. The 
second part is called the Operations Stage—the 
period during which the group puts its plan into ef- 
fect and actually builds the model, using the 
blocks provided by the examiner. 

In both stages the aim is to obtain a measure 
of the best possible performance the group is cap- 
able of rendering. This aim is accomplished by 
having the examiner play a precisely defined and 
standardized role in his administration. His initial
task is to explain the test and its rules to the
children. In fulfilling this task his actions and 
statements are the same for all groups. His sec- 
ond and more complicated assignment is to create 
and maintain an atmosphere in which optimum 
group planning is possible. To achieve this pur- 
pose he must vary his behavior in accordance 
with the needs of the group. That is, a noisy 
group must be held in check so that communica- 
tion and discussion of ideas is possible; a quiet 
solemn group must be encouraged to express ideas; 
a group that gets lost in the complexity of its own
ideas must be rescued and its thinking clarified. 
Throughout all this the examiner must refrain 
from providing the children with his own ideas of 
ways of solving the problem and from telling them 
which of the ideas they have suggested are good 
or poor. He must at all times maintain a permissive
atmosphere in which the giving of ideas is
encouraged and all suggestions warmly accepted, 
but in which no positive direction is supplied. In 
such an atmosphere the children can perform 
only at the level at which they themselves are 
presently capable, and the quality of their plan- 
ning thus provides a measure of the amount of 
skill which they possess. 

The examiner’s final task is to insure that, before
entering the Operations Stage, each child in
the group clearly understands the plan which has 
been devised. Therefore, he repeats verbatim the 
details of the plan which the class has built and 
asks if there are further questions before they begin
the construction task. Here again the examiner
must vary his behavior with the needs of the
group, answering questions on points which the 
group has already decided, and referring back to 
the group any questions about points which they 
have not taken into account. It is not until the chil- 
dren tell the examiner that they are all ready to 
begin work that he permits them to start. This 
insures that all groups enter the Operations Stage 
with presumably equal clarity concerning the plan 
that they as a group have built. 

During the Operations Stage the role of the ex- 
aminer is also standardized. After the signal to 
start is given he withdraws to one side of the class- 
room and refuses to answer questions or provide 
help, even when requested. Regardless of what 
the children do (however noisy, rough, boisterous,
or quarrelsome they become, or however quietly 
and efficiently they work), he neither reprimands 
nor commends them. He interferes only if behavior
gets so out-of-hand that physical harm to children
is apt to result. If this occurs, he stops the
test and no further problems are given. 

The mechanics of administering the test place 
the children in a situation where progress toward 
the goal is not possible without some type of coop- 
eration on the part of each child in the group and 
where external controls designed to elicit particu- 
lar kinds of behaviors and suppress other kinds do 
not exist. In such a situation the children can 
show the extent to which they are capable of carry- 
ing out a plan of action they themselves have de- 
vised, and the kind of behavior they exhibit pro- 
vides a measure of the amount of skill they pos- 
sess in this respect. 

During the time the children are engaged in the
task of planning how to solve the problems and in
working on the actual construction of the various
models, the observer (seated near the back of the
classroom) keeps a record of their behavior using
standardized observation sheets. The data thus
collected are later transformed into a set of numerical
scores which are used to rank groups on
the basis of their skill in cooperative group planning
procedures and in techniques of cooperative
group action. (A manual containing complete directions
for administering and scoring the test
can be obtained from ETS.)


Empirical Development of Scoring Procedures 


The methods of scoring which are discussed in 
a subsequent section are the result of two years’ 
research. Table I shows the large-scale adminis- 
trations that have been completed, but fails to il- 

TABLE I

EXPERIMENTAL ADMINISTRATIONS OF THE RUSSELL SAGE
SOCIAL RELATIONS TEST

Date              No. of Participating Schools

November 1953     One
February 1954     One (retest)
April 1954        Fifteen
November 1954     Twenty-one
May 1955          Twenty-one (retest)

                  No. of
Fifth             4
Fifth             4
Fifth             10
Sixth             10
Fourth            13
Fourth            13
Fifth             9


lustrate the fact that after each administration 
the observation, recording, and scoring tech- 
niques employed underwent drastic revisions. 
Each revision represented yet another attempt to
break away from the familiar framework of scoring
individuals and to devise a reliable and meaningful
system for scoring groups.

Such attempts proved to be unusually difficult 
since all previous research in this general area 
has employed techniques which describe segments 
of individual behavior, either by means of a check 
list such as that devised by Bales (1) or by means 
of verbatim recordings of individuals’ comments,
attitudes, actions, etc., as employed by Lippitt 
and White (4). All attempts to apply such tech- 
niques proved unsatisfactory from the start. In 
the first place, the groups dealt with were so 
large that in focusing upon the behavior of specif- 
ic individuals the interaction among individuals 
was lost sight of. Secondly, the varieties of interaction
which occurred in different groups were
so complex that individual recordings failed to
catch much of the important information. Finally,
a primary aim was to develop an instrument which
teachers themselves could use with their classes; 
hence complex techniques which required an un- 
due amount of time, effort, special training, and 
expense could not be used. 

Further search of the literature revealed that 
most research on groups of the size in which we 
were interested used sociometric ratings and 
specially designed questionnaires. The few spec- 
ial techniques that had been devised were similar 
to Pepinsky’s Group Participation Scale (5), a 
modified ‘‘guess-who’’ instrument which not only 
depended heavily upon the intelligent cooperation 
of the subjects but produced a limited picture of 
only one aspect of group interaction. 

Two important guideposts determined the general
direction which the development of observation,
recording, and scoring techniques for the
present test should take. The first of these was 
the definition of the ‘‘good’’ group implicit in the 
Social Relations section of Elementary School Ob- 
jectives (2). The second was the general theory 
of behavior outlined by Krech and Crutchfield (3), 
particularly those parts of the theory which relat- 
ed to frustration and aggression. We discovered 
that ‘‘good’’ groups, although few in number, did 
exist and were readily distinguishable from ‘‘not 
good’’ groups. To extract the simple variables 
that reflected these differences in a precise manner
proved a lengthy and time-consuming task.

The November 1953 administration was conducted
to see whether the test would ‘‘work’’
with elementary school children. The results
were quite satisfactory, children being both interested
in and motivated by the problems. Following
this administration a beginning was made on
the development of recording techniques which
would successfully capture, and scoring techniques
which would accurately reflect, the wide
differences in the test behavior of different groups. 
Tentative procedures were devised and tried out 
in a retest of the same groups in February 1954. 
Although this retest pointed up innumerable flaws 
in the methods so far developed, sufficient data 
were accumulated to show that over this three-month
interval some of the groups had improved
their test performance, whereas others had dete- 
riorated noticeably. A search for the reasons 
which lay behind such changes led to the formula- 
tion of the following hypothesis: changes in test 
behavior over time are a function of: (1) the nature 
of the teaching methods a teacher employs in her
classroom, and (2) the kinds of controls the teach- 
er uses to make her methods effective. 

To test this hypothesis, the cooperation of the 
elementary school supervisor in a large city system
was solicited. The supervisor was made acquainted
with the test and with our ideas regarding
what it measured and was asked to select for us
five pairs of teachers at each of three different
grade levels: fourth, fifth, and sixth. One member
of each pair was to be a teacher who placed
great emphasis upon teacher-pupil planning and
group work. The other member of the pair was to
be a teacher who placed major emphasis upon individual
instruction and who did not use group planning
techniques with her children. Since we had
no advance knowledge as to which member of a pair
was which, we felt that if the social relations test
could discriminate between the two it would indicate
that our hypothesis was a tenable one. Of the
thirty classes tested in the April 1954 administration,
twenty-four were predicted correctly. The
six incorrect predictions were found to be due in 
large part to errors and ambiguities which still 
existed in our recording and scoring procedures. 
Accordingly, all procedures were once again sub- 
jected to a few major and several minor revisions. 
The revised techniques were pretested on classes 
in several near-by county schools and were found 
satisfactory. 
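
To give a rough sense of how far twenty-four correct predictions out of thirty departs from guessing, the binomial tail probability can be sketched as below. The chance model is an assumption of this sketch: it treats each class as an independent guess with probability one half, whereas the classes were in fact selected in pairs, so the figure is illustrative only.

```python
from math import comb


def chance_probability(hits: int, trials: int, p: float = 0.5) -> float:
    """Probability of at least `hits` successes in `trials` independent guesses."""
    return sum(
        comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Twenty-four of thirty classes predicted correctly (April 1954 administration).
p_value = chance_probability(24, 30)
print(f"{p_value:.6f}")
```

Under this simplified model the probability is well below one in a thousand. A stricter treatment would count the fifteen teacher pairs, rather than the thirty classes, as the independent units.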

The final step was to determine if others could 
be trained to use the test and obtain results which 
corresponded to ours. To answer this question 
we accepted an invitation to participate in a com- 
prehensive teacher evaluation study being conduct- 
ed by the Division of Teacher Education in New 
York City. The fifty classes tested in the November
1954 administration and retested in May 1955
provided opportunity to experiment with the training
of four different teams of examiners, one administrator
and one observer per team.

The November administration was dishearten- 
ing, for we found that several of the behaviors to 
be observed and recorded were still not defined 
with sufficient objectivity to render them reliably 
scorable by persons not intimately acquainted 
with the test. The retest in May, however, proved
more successful, probably because between the
November and May administrations much effort
was devoted to spelling out in more precise behavioral
terms the variables we were interested
in. It was found that a person with the equivalent
of a bachelor’s degree in education who has had 
some teaching experience can be trained to admin- 
ister the test in approximately two days, working 
with two or three classes each day. Observers, 
however, require approximately four or five days 
of training at the rate of at least two classrooms 
per day. Funds were not available for having two
observers in each classroom, hence reliability 
data could not be collected. The writer worked 
with each observer until an estimated reliability 
of .80 was obtained on at least two classes. The 
observer was then on his own. 

Despite the fact that continuous experimental
revisions prevented our obtaining and reporting a 
series of coefficients descriptive of the reliability 
and validity of the test, it is felt that valuable evi- 
dence of a nonstatistical and subjective type can 
be reported. 


1. Variations in class size ranging between 20
and 30 seem to have no effect upon children’s
test behavior. If a group is larger
or smaller, test behavior apparently is altered
in direct proportion to the amount of
departure in size from this range. For example,
communication is much easier in a
group of 6 or 7 persons than in a group of
40 or more persons, but of approximately
equal difficulty in groups of 20 and groups
of 30.


2. Persons with some educational background
can be trained easily to administer the test 
using the standardized procedures which 
are necessary for adequate and reliable as- 
sessment of group behavior. 


3. Definite evidence has been collected showing
that children’s behavior in the test situation
reflects in some degree the nature and quality
of the educational experiences provided
by the teacher. This finding not only supports,
but is directly in line with, the results
of the Lippitt and White experiments
on the social climate of children’s groups (4).


4. Tentative evidence has been collected in
support of the hypothesis that the administrative
set-up of a school is a more important
determinant of children’s test behavior
than the classroom methods employed
by a particular teacher. The scores
of classroom groups of children seem to
vary less within schools than they do between
schools. A school with a warm,
friendly principal, whose teachers appear
to have a great deal of individual freedom
and good inter-staff relationships, tends to
have classrooms which score relatively high
on the test. A school in which principal and
teachers are stiff, cold, fearful of the test
itself, and aggressive toward the examiners
tends to have classrooms which score relatively
low.


5. Retesting children after a lapse of from three
to six months and using the same problems
has been found to be a feasible procedure.
The second administration is always given
with the instruction, ‘‘We would like to see
how much better you can do this time,’’ and
with but rare exceptions children enter into
the task as eagerly and as cooperatively as
they did at first. Changes in behavior from
one administration to the next apparently reflect
the nature of the classroom learning
experiences children have been exposed to
in the interim, rather than learning how to
take the test.


6. Quite by accident it has been found that in
many ways the test is as applicable for adult
groups as for children’s groups and can, with
appropriate modifications, be scored in the
same way. We have found, for example,
that an adult group unskilled in techniques
of cooperative group planning scores definitely
below a third-grade group which has
received training along these lines, and that
an adult group highly skilled in such techniques
surpasses a highly skilled children’s
group only in the complexity and sophistication
of their final plan of action.


The problem of scorer reliability, however, is 
still unsolved and it is felt that, before the tech- 
niques described in the following section can be 
used with confidence, they must be subjected to a 
detailed and rigorous statistical analysis. Re- 
search on this problem is now in progress. 


Recording and Scoring Procedures 


Since the types of scores a situational test provides
are determined by the specific behaviors
which are observed and recorded, a first step in 
scoring is that of setting up behavioral definitions 
of the skills the test is designed to measure. 

On the basis of the material presented in the 
section on Social Relations in Elementary School 
Objectives (2), as well as on that found in the lit- 
erature of education and of group dynamics, a 
group skilled in cooperative planning techniques 
can be said to be one characterized by a widespread,
interested exchange of appropriate ideas
in which definite growth in the direction of great- 
er clarity and precision of thinking occurs as the 
discussion progresses. In addition, the skilled 
group possesses a substantial amount of autonomy
in that the members can, without external help or
restraint, conduct themselves during the discus- 
sion in such a way that progress toward their 
goal of devising a mutually satisfactory plan of 
action is implemented rather than impeded. A 
group skilled in techniques of cooperative group 
action can be said to be characterized by a harmonious
working atmosphere in which there is
widespread and interested concern on the part of 
all members regarding rapid and efficient pro- 
gress toward the group goal. 

Using these definitions as guides, two sets of 
relatively independent variables (one for the Plan- 
ning Stage and one for the Operations Stage) were 
developed for scoring purposes. For the Plan- 
ning Stage the variables are: 


1. Participation—The extent to which individ- 
ual children in the group enter actively in- 
to the planning discussion. In a skilled 
group many children participate; in an un- 
skilled group few children participate. 


2. Involvement—The extent to which children 
exhibit interest in and concern for the task 
at hand. In a skilled group the majority of 
the children evidence an active and sus- 
tained interest in the planning. In an un- 
skilled group interest is quite low and may 
be completely lacking. 


3. Communication— The way in which children 
exchange ideas during the discussion period.
In a skilled group children listen to
each other, critically and constructively
evaluate each other’s ideas, and integrate
simple ideas to build improved and more
complex ones. In an unskilled group children
act independently, and there is no real
or fruitful exchange of ideas and suggestions.


4. Autonomy— The extent to which children can 
discuss the problem and reach a point of 
decision about a final plan without the help 
or restraint of the examiner. The skilled 
group is almost completely independent of 
the examiner. The unskilled group requires 
continuous help and/or restraint to get 
through the planning period. 


5. Organizational Techniques—The kinds of 
ideas children have about ways of organizing
themselves to attack the problem. In a
skilled group the major part of the discussion
is centered around concepts of leadership
and organized subgroups. In an unskilled
group the discussion centers around
the general idea of having each child work 
as an individual. 


6. Final Plan—The quality of the plan which the 
children eventually devise. In a skilled 
group the plan is precise and detailed, with 
every child in the group knowing exactly 
where he is to go and what he is to do. In an
unskilled group the plan is essentially non- 
existent, being so vague and indefinite that 
no child in the group has any clear idea of 
what his responsibility, as an individual, is 
with regard to building the model. 


For the Operations Stage the variables are: 


1. Involvement — The extent to which children 
exhibit interest in and concern for accom- 
plishing the task at hand. In a skilled group 
the majority of the children evince sustained 
interest in the progress being made either 
by themselves or by those who are engaged 
in doing the building. In an unskilled group 
many children withdraw from the field alto- 
gether and pursue non-problem-centered 
activities. 


2. Atmosphere—The psychological tone-quality
of the group of children who remain in
the problem field as revealed by the kinds 
of statements the children make and by the 
tone of voice in which the statements are 
made. In the skilled group, problem-solv- 
ing is carried on in a warm, friendly, sup- 
portive, and harmonious atmosphere. In 
the unskilled group the atmosphere can range 
from an extreme and unnatural quiet to 
screaming excitement or open fighting. 


3. Activity—The kind of behavior children who 
have withdrawn from the problem field engage
in. In the skilled group few or no children
withdraw, and the few who do engage
in acceptable classroom behaviors. In the 
unskilled group many children tend to with- 
draw, and the majority of them participate 
in highly unacceptable classroom behaviors. 


4. Success— The time necessary for the group 
to complete the task of building the model. 
The skilled group can complete the task in 
ten minutes or less. The unskilled group 
seldom finishes within the allotted fifteen 
minutes, or their behavior becomes so out- 
of-hand that the test must be stopped before 
the fifteen minutes are up. 


Space does not permit a detailed report of the 
standardized directions which have been developed 
concerning the specific behaviors which are ob- 
served, the way in which they are recorded, and 
the way in which they are combined to provide nu- 
merical scores on each of these ten variables. 
For present purposes it must be sufficient to state 
that on the basis of the data recorded on the observation
sheets every group taking the test receives
a score of from 1 to 5 on each variable, 1
representing the poorest score and 5 represent- 
ing the best score. These scores are then com- 
bined to yield a single descriptive rating which 
provides a meaningful picture of the group’s so- 
cial relations skill. 
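
The scoring scheme just described (ten variables, each scored from 1 to 5) can be represented by a small record type. The sketch below is illustrative only: the class and field names are hypothetical, and it deliberately does not combine the scores into a single rating, since the rule for doing so is not given at this point in the text.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Variable names follow the two lists given in the text.
PLANNING_VARIABLES: Tuple[str, ...] = (
    "participation", "involvement", "communication",
    "autonomy", "organizational_techniques", "final_plan",
)
OPERATIONS_VARIABLES: Tuple[str, ...] = (
    "involvement", "atmosphere", "activity", "success",
)


def _check(scores: Dict[str, int], expected: Tuple[str, ...], stage: str) -> None:
    """Require exactly the expected variables, each scored 1 (poorest) to 5 (best)."""
    if set(scores) != set(expected):
        raise ValueError(f"{stage} stage needs exactly these variables: {expected}")
    for name, value in scores.items():
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{stage} score for {name!r} must be an integer from 1 to 5")


@dataclass(frozen=True)
class GroupScores:
    """Hypothetical record of one classroom group's ten variable scores."""
    planning: Dict[str, int]
    operations: Dict[str, int]

    def __post_init__(self) -> None:
        _check(self.planning, PLANNING_VARIABLES, "Planning")
        _check(self.operations, OPERATIONS_VARIABLES, "Operations")


# Made-up scores for one group (illustration only).
example = GroupScores(
    planning={name: 3 for name in PLANNING_VARIABLES},
    operations={name: 4 for name in OPERATIONS_VARIABLES},
)
print(len(example.planning) + len(example.operations))
```

Keeping the two stages as separate mappings mirrors the test's division into a Planning Stage and an Operations Stage and leaves the combination rule open.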

In the beginning phases of the research an at- 
tempt was made to combine the six variables of 
the Planning Stage and the four variables of the 
Operations Stage in such a way that groups could 
be ranked from high to low along each of two con- 
tinua designated as skill in cooperative group 
planning and skill in cooperative group action. It 
was soon found, however, that a simple ranking 
of this type disregarded much of the information 
that was deemed educationally and psychologi- 
cally significant. The reason was an obvious one. 
Skill in cooperative group planning does not con- 
sist of one but rather of several interrelated and 
interdependent component skills. The same is 
true of skill in cooperative group action. Groups 
can (and do) vary from each other in any one, two, 
three, or all of these component skills as well as 
in different combinations of them. It became es- 
sential, therefore, to devise a conceptual frame- 
work for describing differences in the behavior of 
groups that could handle variations in several 
variables rather than variations in a single vari- 
able. 

Study of the test performance of many differ- 
ent groups of children revealed that in each stage 
of the test certain variables acted as limiting fac- 
tors on others. These factors, termed limiting 
variables, were of such a nature that if a group 
received low scores on them the group was pre- 
vented from receiving high scores on any of the 
other non-limiting variables. The converse was 
not true. Groups who received high scores on 
these limiting variables did not necessarily re- 
ceive high scores on the non-limiting ones, but 
could range in score from very poor to very good. 
This finding eventually led to the postulation of 
two axes, one composed of the limiting variables 
and the other composed of the non-limiting ones. 
These axes were then aligned in such a way that a 
group’s position on the one determined the range 
of positions it could occupy on the other. 
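
The one-directional relationship described above can be illustrated with a small consistency check. The following Python sketch is only an illustration and is not part of the test's scoring directions: the function name and the cut-offs `low` and `high` are assumptions made for the example, and only the 1-to-5 score range is taken from the text.

```python
def violates_limiting_rule(limiting, non_limiting, low=2, high=4):
    """Return True if a group's scores contradict the limiting relationship.

    limiting, non_limiting: lists of scores from 1 (poorest) to 5 (best).
    low, high: hypothetical cut-offs for what counts as a low or high score.
    """
    has_low_limiting = any(score <= low for score in limiting)
    has_high_non_limiting = any(score >= high for score in non_limiting)
    # A low score on a limiting variable should rule out high scores
    # on the non-limiting variables; the converse is not required.
    return has_low_limiting and has_high_non_limiting


# Low limiting scores together with a high non-limiting score
# would contradict the reported finding.
contradiction = violates_limiting_rule([1, 2], [5, 3])

# High limiting scores place no restriction on the other variables,
# which may range from very poor to very good.
poor_but_allowed = violates_limiting_rule([5, 5], [1, 2])
good_and_allowed = violates_limiting_rule([5, 4], [5, 5])
```

Under these assumed cut-offs only the first call reports a contradiction; the other two combinations are both permitted by the relationship.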

Examination of the psychological nature of the 
variables which determine a group’s position on 
these two axes revealed that the limiting variables 
were related to the amount and kind of discipline 
children exhibit in the test situation. The non- 
limiting variables concerned the amount of tech- 
nical knowledge children seemed to possess about 
how to organize themselves to accomplish a group 
task. The reason for the somewhat peculiar re- 
lationship between the several variables which go 
to make up skill in social relations now becomes 
clear. The technical knowledge which members 
of a group possess cannot be put to use unless 
members also possess an adequate amount of self- 
discipline. That is, children (and adults too, for 
that matter) might be aware of what ‘‘ought’’ to be 
done in order best to accomplish a specific group 
goal, but, unless each individual conducts himself 
in a manner conducive to such accomplishment, 
optimum progress toward the goal becomes im- 
possible. On the other hand, effective self-dis- 
cipline on the part of group members cannot, in 
and of itself, serve to make up for lack of the tech- 
nical ‘‘know-how’’ necessary for maximum group 
efficiency. The extent to which a group is able to 
utilize effectively the technical knowledges and 
skills which its members possess is thus limited 
by the extent to which the members are self-dis- 
ciplined. 

Figure 2 shows the way in which two axes are 
aligned in the Planning Stage and in the Operations 
Stage of the test. The horizontal axis has been 
designated Discipline, and a group’s placement on 
it is determined by scores received on the limit- 
ing variables in each stage. The vertical axis 
which bisects it has been designated Efficiency 
(Organizational Efficiency in the Planning Stage 
and Operational Efficiency in the Operations Stage). 
A group’s placement on this axis is determined by 
the non-limiting variables in each stage. The 
dashed restraining lines in the figure are purely 
illustrative, serving to indicate that a group’s dis- 
tance in either direction from the mid-point of the 
horizontal axis limits, but does not wholly deter- 
mine, the height to which the group can rise on 
the vertical axis. 

In the Planning Stage the variables Involvement 
and Autonomy were found to be limiting ones. In- 
volvement, it will be recalled, is a measure of to- 
tal group interest, and Autonomy is a measure of 
the extent to which a group can function independ- 
ently of the examiner. Only a group whose mem- 
bers possess a high degree of self-discipline has 
the ability to get through the planning discussion 
without examiner interference. Group members 
will maintain interest in the planning discussion 
only to the extent that they feel themselves to be 
needed, wanted, and important members whose 
ideas are worthy of attention and consideration by 
others. The more this feeling is present, the 
more self-disciplined a group tends to be and the 
more it is able to function independently. Simi- 
larly, the more self-disciplined a group is, the 
more its members tend to consider, as well as 
seek, the ideas and opinions of others. Autonomy 
and Involvement are thus seen to be inextricably 
related, and without these two conditions the prof- 
itable exchange of ideas necessary to good group 
planning is impossible. 

As Figure 2 suggests, a group’s lack of Auton- 
omy in the Planning Stage can manifest itself in 
different ways. Groups can be un-disciplined in 
the sense that all forms of control, both external 
and internal, are absent, or they can be over- 
disciplined in the sense that external controls 
have been so strongly imposed that the develop- 
ment and/or functioning of self-discipline is not 
possible. In the test situation the completely un- 
disciplined group is a rowdy, noisy, restless one 
in which children pay little or no attention either 
to the examiner or to the task at hand. The ex- 
tremely over-disciplined group, on the other hand, 
is one in which children are so restrained and 
passive that the examiner can elicit almost noth- 
ing from them. Both groups, in the extreme case, 
are almost totally lacking in interest and involve- 
ment in the test problems, and this combination 
of factors prevents them from functioning effi- 
ciently during the planning period. 

In the Operations Stage of the test the limiting 
variables were found to be Atmosphere and Activ- 
ity. Atmosphere refers to the psychological 
tone-quality of the group of children who remain 
in the problem field and who exhibit concern for 
the construction task. Activity refers to the kind 
of behavior children who have withdrawn from the 
problem field engage in. A harmonious working 
or problem-solving atmosphere in which optimum 
progress is possible can, in the present test sit- 
uation, be created and maintained only if each 
group member subjects himself to the rules of the 
plan devised during the discussion period and re- 
frains from engaging in any behavior which will 
impede progress toward the group goal. Group 
members who have withdrawn from the problem 
field must likewise refrain from engaging in be- 
haviors which are so rowdy or so hostile that they 
disrupt the work in progress. Atmosphere and 
Activity, therefore, are concerned with the amount 
of self-discipline group members possess, and 
directly affect the efficiency with which the group 
as a whole can accomplish its task. 

Lack of self-discipline in the Operations Stage 
results in one of two different types of behavior, 
both of which are equally detrimental to efficient 
group accomplishment. As shown in Figure 2, 
children who are undisciplined can evince hostile 
or non-hostile behaviors. In the extreme case 
the undisciplined non-hostile group is one in which 
the classroom is turned into a playground, chil- 
dren running about with complete abandon. On 
the other hand, the group in which members ex- 
hibit extreme hostility towards each other is one 
in which angry quarreling and open fighting pre- 
dominate. In both groups operational efficiency 
is so low that it frequently is necessary to stop 
the test within the first few minutes. 

All four of the non-limiting variables which de- 
termine a group’s placement on the Organization- 
al Efficiency line in the Planning Stage relate to 
those technical skills and knowledges essential to 
building a good plan of action. Since each child 
in the room has one or two of the blocks neces- 
sary to build the puzzle, it is obvious that some 
type of organization is essential to efficient ac- 
complishment of the construction task. If children 
have no knowledge of organizational techniques 
such as those of delegating leadership, appointing 
subgroups to perform certain tasks, or specifying 
when and where different groups are to work, a 
good final plan cannot be built. Possession of such 
knowledge, however, is not by itself sufficient. 
Children must also possess the ability to express 
their ideas, to communicate them to others, to 
listen to the comments and criticisms which others 
might make, and to engage in that fruitful ex- 
change of ideas and points of view which results 
in the most precise, the most comprehensive, and 
the most detailed plan of action. The extent to 
which group members possess these knowledges 
and skills is reflected by scores on the variables 
Communication, Participation, Ideas, and Final 
Plan. 

In the Operations Stage the non-limiting vari- 
ables are Success and Involvement. Success is a 
highly objective measure of efficiency based upon 
the time it takes the children to build the model; 
it is dependent upon both the excellence of the plan 
and the discipline with which it is carried to com- 
pletion. Involvement, although still a measure of 
total group interest, is not a limiting variable in 
this stage as it is in the Planning Stage. This is 
because skill in techniques of cooperative action 
is revealed in the extent to which children, when 
placed in a completely free and unstructured situ- 
ation, can exhibit and maintain genuine concern 
for accomplishing a group goal. This concern 
may be manifested directly or indirectly. Direct 
Involvement is exhibited by children who remain 
in the vicinity of the problem table, following 
closely the progress being made on the task, and 
also by children who, although drawing slightly 
apart from the builders, obviously keep an eye on 
the progress being made. Indirect Involvement is 
evidenced by children who, quietly confident that 
in the Planning Stage they, as a group, have made 
the best possible plan of action, pursue other on- 
going classroom activities and leave those few who 
have been delegated to perform the construction 
task to carry out their responsibility. 

It is possible for a group to receive a high score 
on Success (indicating that the problem was solved 
in a relatively short period of time) because the 
majority of the children refused to accept the prob- 
lem and withdrew from the field altogether, leav- 
ing the job in the hands of three or four group 
members. In such cases the problem is obvious- 
ly no longer a ‘‘group’’ problem and the total 
group’s rating on Operational Efficiency is low de- 
spite their high Success score. Success, meas- 
ured in number of minutes it takes to solve the 
problem, thus becomes a meaningful measure of 
‘‘skill in cooperative group action’’ only when the 
Involvement score is high, indicating that the ma- 
jority of group members remained interested in 
the accomplishment of the task. For this reason, 
Operational Efficiency must be determined by 
scores on both Success and Involvement variables. 
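
The dependence of Operational Efficiency on both scores can be sketched in a few lines of Python. The combining rule used here (taking the lower of the two scores) is purely an assumption for illustration; the text states only that a high Success score cannot offset low Involvement, and the standardized directions for combining scores are not reproduced in this report.

```python
def operational_efficiency(success, involvement):
    """Hypothetical rating: the weaker of the two scores governs.

    Both arguments are scores from 1 (poorest) to 5 (best).
    Taking the minimum is an assumed rule, not the published procedure.
    """
    for score in (success, involvement):
        if not 1 <= score <= 5:
            raise ValueError("scores must lie between 1 and 5")
    return min(success, involvement)


# A fast solution reached by three or four children while the rest
# withdrew: high Success, low Involvement, hence a low rating.
few_builders = operational_efficiency(success=5, involvement=1)

# A fast solution with the whole group remaining interested.
whole_group = operational_efficiency(success=5, involvement=5)
```

Any rule with the same property (a low Involvement score holding the rating down regardless of Success) would serve the illustration equally well.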

To summarize, a group’s placement on the dis- 
cipline axis is determined in the Planning Stage 
by scores on Autonomy and Involvement, and in 
the Operations Stage by scores on Atmosphere and 
Activity. Placement on the Organizational Effi- 
ciency continuum is determined by scores on Par- 
ticipation, Communication, Ideas, and Plan, and 
on the Operational Efficiency continuum by scores 
on Involvement and Success. 
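
The assignment of variables to axes summarized above can be written down as a small lookup table. In the Python sketch below the variable names come from the text, but the dictionary layout, the function name, and the use of a simple average to place a group on an axis are assumptions made for illustration only.

```python
from statistics import mean

# Which variables determine each axis in each stage of the test.
AXIS_VARIABLES = {
    "planning": {
        "discipline": ["Autonomy", "Involvement"],
        "efficiency": ["Participation", "Communication", "Ideas", "Plan"],
    },
    "operations": {
        "discipline": ["Atmosphere", "Activity"],
        "efficiency": ["Involvement", "Success"],
    },
}


def axis_placement(stage, scores):
    """Average the 1-to-5 scores belonging to each axis.

    Averaging is an assumed combining rule, used here only to show
    which scores feed which axis.
    """
    return {
        axis: mean(scores[name] for name in names)
        for axis, names in AXIS_VARIABLES[stage].items()
    }


planning_scores = {
    "Autonomy": 4, "Involvement": 5,
    "Participation": 3, "Communication": 4, "Ideas": 2, "Plan": 3,
}
placement = axis_placement("planning", planning_scores)
```

Note that Involvement appears under the discipline axis in the Planning Stage but under the efficiency axis in the Operations Stage, as the text describes.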


Classification of Groups on the Basis 
of Test Behavior 


Every group that takes the test is classified on 
two bases: (1) their Planning Stage behavior, and 
(2) their Operations Stage behavior. This classi- 
fication is made using the theoretical framework 
described above for summarizing the data obtained 
for a group. For ease of communication, each 
‘‘group-type’’ has been assigned a descriptive 
name, as shown in Figures 3 and 4. The dividing 
lines in these figures are by no means rigid, serv- 
ing only to illustrate the fact that the defined areas 
represent points of concentration in a plane. All 
of these group-types have been observed and re- 
corded empirically. Their major characteristics 
are given below. 


Group-Types in the Planning Stage 


Figure 3 shows the seven major types of groups 
that have been defined for the Planning Stage of 
the test. The characteristics of each type are as 
follows: 


1. The Mature Group—Characterized by good 
ideas about ways of organizing themselves 
and by an excellent communication pattern. 
The discussion is widespread, animated, 
self-controlled and well conducted. Fre- 
quently the examiner can retire altogether, 
leaving the total planning session in the 
hands of one or two student leaders. 


2. The Dependent Group—Characterized by 
good ideas about ways of organizing them- 
selves and by a good communication pattern. 
This group, however, possesses less auton- 
omy than the Mature Group, and to get them 
through the discussion the examiner has to 
provide frequent assistance in the form of 
summarizing ideas, suggesting next steps, 
reminding them to speak one at a time, etc. 


3. The Immature Group—Characterized by an 
alert and eager interest in the test, com- 
bined with an almost total lack of skill in 
group problem-solving procedures. Ideas 
about organizational techniques are to all 
practical purposes non-existent and the 
group generally accepts by consensus the 
first or second idea mentioned, regardless 
of its feasibility. The active interest, how- 
ever, serves to keep the group well disci- 
plined and little in need of examiner aid or 
restraint. 


4. The Semi-Controlled Group—Characterized 
by fairly wide and generally interested par- 
ticipation. The communication pattern is 
relatively good, as is knowledge of organi- 
zational techniques and quality of the final 
plan. Group discussion is markedly imped- 
ed by the fact that group members tend to 
talk all at once. The examiner is always 
hard put to keep the classroom sufficiently 
quiet and the discussion sufficiently central- 
ized so that children can hear each other and 
really develop their ideas. 


5. The Semi-Restrained Group—Character- 
ized by an atmosphere of ‘‘dutiful recita- 
tion.’’ Children are quiet, polite, attentive, 
and respond readily to the examiner’s re- 
quests. Their ideas are generally good, but 
the planning discussion tends to be stiff and 
artificial, as though children were doing 
what they feel they ‘‘ought’’ to do rather than 
entering freely, wholeheartedly, and with 
enjoyment into the task at hand. The dis- 
cussion is seldom widespread; it is gener- 
ally carried by one, two, or three children, 
the rest remaining quietly passive and disin- 
terested. 


6. The Uncontrolled Group—Characterized by 
a complete lack of discipline, combined with 
a total lack of skill in techniques of planning 
and communication. 


7. The Restrained Group—Characterized by an 
almost complete and highly unnatural si- 
lence, combined with zero skill in techniques 
of planning and communication. The exam- 
iner must work very hard with this group, 
prodding them constantly to get them to re- 
spond at all. 


Group-Types in the Operations Stage 


In the Operations Stage, as in the Planning 
Stage, the results of many administrations of the 
test have shown that groups tend to cluster in a 
few areas along the two axes, Discipline and Op- 
erational Efficiency. As shown in Figure 2, lack 
of self-discipline in this stage of the test can be 
evidenced in either hostile or non-hostile behav- 
ior. The type of behavior exhibited serves to lo- 
cate a group on the right or left side of the tri- 
angle. Figure 4 shows the nine general group- 
types that have been observed. In this figure, 
groups falling on the left side of the triangle are 
non-hostile groups, those falling on the right side 
are hostile ones, and those in the center are best de- 
scribed as task-centered. The lower a group’s 
position on the triangle, the less efficient it is as 
an operating unit, and the further away it is from 
the center line, the less disciplined are its mem- 
bers. As in the Planning Stage, we again find that 
the amount and kind of self-discipline group mem- 
bers possess serves to limit their efficiency in 
accomplishing a particular task. 

The nine types of groups shown in Figure 4 are 
briefly described as follows: 


Task-Centered Groups 


1. The Mature Group—This group is character- 
ized by a warm, friendly, supportive atmos- 
phere in which all members exhibit sustained 
interest in accomplishing the task rapidly and 
efficiently. Once children have performed 
jobs assigned them, they either socialize free- 
ly, but quietly, until the task is completed or 
pursue ongoing classroom activities. This 
group completes the task in approximately ten 
minutes or less. It is distinguished from all 
other groups by the fact that those children not 
directly concerned with the task of putting 
blocks together still remain involved in the 
work. 


2. The Immature Group—The atmosphere of this 
group, although basically warm and friendly, 
is best described as ‘‘bumbling.’’ Children 
exhibit a sincere concern for accomplishing the 
task, but their lack of technical skill in build- 
ing an efficient plan of action renders their ef- 
forts ineffective. The majority mill around 
the problem table, putting blocks together in a 
more or less random fashion, trading blocks, 
taking them apart and reassembling them, us- 
ually to little avail. The warmth and friendli- 
ness of the atmosphere, however, prevents the 
group from breaking up despite their frustrat- 
ing trial-and-error methods. This group rare- 
ly completes the problem in the allotted time, 
usually being less than three-fourths finished 
when the fifteen minutes are up. 


3. The Disinterested Group—This group is char- 
acterized by extremely low involvement on the 
part of most members in the class. Because 
only a small group works on the problem, the 
working atmosphere is generally a quiet one 
and success, measured by the time necessary 
to complete the task, is high. Since the prob- 
lem is no longer a group problem, the over- 
all rating on Operational Efficiency is near 
zero despite the high degree of objective 
achievement. This group is distinguished from 
the Rowdy and the Quarreling Groups by the 
fact that children who have withdrawn pursue 
acceptable (i.e., non-disruptive) classroom 
activities. 


Non-Hostile Groups 


4. The Rollicking Group—Characterized by an 
atmosphere which, although basically friendly, 
is loud, noisy and generally uproarious. Chil- 
dren not directly concerned with the building 
task crowd around the participants shouting di- 
rections, making joking comments about pro- 
gress, and laughing loudly over any errors 
which are made. The melee generally becomes 
too much for some members, and they with- 
draw from it altogether—not in anger but for 
self-preservation. The uproar and lack of ser- 
iousness lower this group’s over-all efficiency. 


5. The Excited Group—Children in this group are 
unable to control their excitement over the task. 
The atmosphere is charged with tenseness, 
voices are sharp and high, and movements are 
jerky as children work feverishly at the build- 
ing task. In the excitement tempers become 
short and arguments frequent, but real hostil- 
ity is absent as shown by the fact that outbursts 
are task-centered, rather than personal, and 
are short-lived. The noise, tenseness, and 
excitement prevent children from achieving a 
high success score, although they frequently 
manage to finish before the allotted time is up. 


6. The Rowdy Group—This group is similar to the 
Excited Group, except that here all forms of 
control are so lacking that the excitement gets 
completely out-of-hand. The screaming, run- 
ning, shouting, and yelling reach such a pitch 
that the examiner frequently is forced to stop 
the test. This group’s efficiency is always 
near zero. 


Hostile Groups 


7. The Suppressed Group—Characterized by an 
almost abnormal quiet. Children engaged in 
the building task speak in whispers or in a low 
murmur, and it is rarely possible to hear spe- 
cific statements. Children not engaged in the 
building task sit quietly and idly in their seats 
doing nothing. 


8. The Bickering Group—This group is character- 
ized by an atmosphere of contained hostility. 
There is a continuous flow of negatively criti- 
cal arguments about who should be at the prob- 
lem table and how the puzzle ought to be built. 
The hostility apparent in these arguments is 
usually sufficient to cause many children to 
withdraw from the field altogether with the at- 
titude: ‘‘They think they’re so smart—let them 
do it.’’ Those who remain continue to bicker 
about how the task should be done; those who 
withdraw lose all interest in the work, sitting 
quietly (though often sullenly) at their desks or 
engaging in some type of play activity. Fre- 
quently, children hide their blocks so that the 
builders cannot complete their task. 


9. The Quarreling Group—In this group the hos- 
tility gets completely out-of-hand. Quarreling 
over possession of blocks and fighting for po- 
sitions at the problem table generally reach the 
point at which the test has to be stopped. 


Changes in Test Performance Over Problems 


The present test, being concerned with the 
measurement of dynamic social and psychological 
forces which reveal themselves in changes in be- 
havior rather than in consistency of behavior, 
does not possess reliability in the sense of having 
high inter-problem (i.e., inter-item) correla- 
tions. As a group progresses from the Planning 
Stage of problem one to the Operations Stage, and 
from there to the Planning Stage of problem two, 
etc., the nature of the psychological forces active 
in that particular group is revealed by the pat- 
tern of changes which occur in test behavior. It 
is hypothesized that if these forces are predomin- 
antly constructive in nature, the group will be- 
come more cohesive and more efficient over prob- 
lems. Thus a group that scores Immature on 
problem one planning and Mature on problem one 
operations, generally scores Dependent or Semi- 
Controlled in subsequent planning sessions. If, 
on the other hand, these forces tend to be predom- 
inantly destructive in nature, the group will show 
a more or less rapid trend toward disintegration 
and inefficiency, the strength of these forces be- 
ing revealed in the rapidity with which the group 
reaches the point of complete disintegration. 

The explanation for these phenomena is based 
upon two assumptions. First, every group, re- 
gardless of the attributes or past experiences of 
its members, contains within it potentialities for 
both constructive and destructive group action. 
Second, the direction which these potentialities 
ultimately take is determined primarily by the ex- 
periences to which the members—as a group— 
have been subjected. Accordingly, groups in 
which constructive forces are stronger than de- 
structive ones are hypothesized to be those in 
which children have been exposed to many happy 
and constructive working experiences in a group 
situation where respect for the abilities, needs, 
and feelings of others is the rule. Groups in 
which destructive forces are paramount are hy- 
pothesized to be ones in which primary emphasis 
has been placed upon individual work and individ- 
ual accomplishment. Children in these groups 
lack the kind of experiences in which social rela- 
tions skills may be learned, and thus fail in their 
efforts when suddenly placed in a situation which 
demands such skills. Groups in which destruc- 
tive forces show themselves in hostile behaviors 
are believed to be ones in which children’s exper- 
iences have been such as to generate marked feel- 
ings of rivalry, jealousy, bitterness, and frustra- 
tion. Those in which destructive forces show 
themselves in rowdy, although non-hostile, behav- 
iors are believed to be ones in which children have 
not been taught to handle their freedom wisely and 
constructively. There is no frustration, only a 
boredom which breaks into excited and welcomed 
horseplay when external controls are released. 

These and similar hypotheses are as yet quite 
tentative, as are many of the findings here report- 
ed. Much research of a rigorous nature is needed 
before the test can be used with confidence and 
with accuracy. Research on reliability is now in 
progress, further use of the test hinging upon the 
outcome. 


Summary 


This paper describes a situational test useful 
for evaluating the social relations skills of ele- 
mentary school children in a group problem-solv- 
ing situation. Data accumulated during the devel- 
opment of the instrument indicate that the obser- 
vation and scoring procedures here reported pos- 
sess satisfying reliability, but the rigorously de- 
signed study which will provide definitive answers 
to such questions has not been completed. The 
test is presented as a research technique which 
might prove valuable in studying the effects of var- 
ious kinds of educational environments upon the 
behavior of children’s groups. 